Updated 3 days ago

Confusion with Credential Configuration in Kedro 0.19 vs 0.18

Issue Summary
Hello Kedro team,
I have encountered an issue regarding the configuration of credentials for accessing storage via abfss in Kedro 0.19.3, which was not present in version 0.18. Here is a summary of the problem:
In Kedro 0.18, I configured the credentials for accessing storage through Spark configurations with Azure Service Principal, and everything worked fine. However, after upgrading to Kedro 0.19.3, the same setup stopped working. After spending a couple of days troubleshooting, I discovered that adding the credentials as environment variables resolved the issue.
My questions are:

  1. Does Kedro 0.19.3 read these environment variables directly?
  2. Is this behavior managed by Kedro itself or by the abfss library?

Additionally, it seems redundant to add the credentials both in the Spark configuration and as environment variables. This redundancy is confusing and feels like a bug rather than a feature. Could you please clarify if this is the intended behavior?
Execution Environment:
  • This is being executed in Databricks.
  • The Spark configurations to use Azure Service Principal are added to the Databricks cluster used. (The cluster configuration includes credentials for multiple storages.)
  • Credentials for only one storage account can be added as environment variables. However, because the Spark configuration is what authenticates the Spark session, simply filling in these variables, even with incorrect values, is enough to access the storages.
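
For readers unfamiliar with the setup described above, here is a minimal sketch of what per-account Service Principal Spark configuration typically looks like. The keys are the standard Hadoop ABFS OAuth settings; the helper name `service_principal_spark_conf`, the account names, and all credential values are hypothetical placeholders, not the poster's actual configuration.

```python
def service_principal_spark_conf(account, client_id, client_secret, tenant_id):
    """Build the Hadoop ABFS OAuth settings for one storage account.

    Hypothetical helper for illustration; all argument values are placeholders.
    """
    suffix = f"{account}.dfs.core.windows.net"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}": (
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
        ),
        f"fs.azure.account.oauth2.client.id.{suffix}": client_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": client_secret,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}": (
            f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"
        ),
    }


# Because every key is scoped to one account, a single cluster
# configuration can hold credentials for several storages at once.
cluster_conf = {
    **service_principal_spark_conf("storagea", "id-a", "secret-a", "tenant-a"),
    **service_principal_spark_conf("storageb", "id-b", "secret-b", "tenant-b"),
}

for key in sorted(cluster_conf):
    print(key)
```

The per-account key suffix is what lets the cluster carry several storages side by side, whereas a single set of environment variables has no such scoping.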

Thank you for your assistance!

1 comment

hola @Carlos Prieto - Tomtom, thanks for the detailed explanation and sorry you had a bumpy experience. we're looking into this.

I have a few follow-up questions:

  • when on Kedro 0.18, what exact version were you using?
  • I assume in both cases you were using OmegaConfigLoader, is that correct?
  • as far as I understand (but I could be wrong), Kedro doesn't do any magic env variable loading for credentials. apart from PySpark, are there any relevant Python dependencies in your environment?
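
To help answer the dependency question above, a small diagnostic along these lines could be run on the cluster. It is only a sketch: the package list is a guess at what might be relevant (for example, fsspec backends such as adlfs may read Azure-related environment variables on their own), and the `AZURE_` prefix is an assumption about how the variables are named. It prints variable names only, never their values.

```python
import os
from importlib import metadata

# Guessed list of packages that could be involved; adjust as needed.
CANDIDATE_PACKAGES = [
    "kedro",
    "kedro-datasets",
    "fsspec",
    "adlfs",
    "azure-identity",
    "pyspark",
]


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def azure_env_var_names(environ=os.environ):
    """Return the sorted names (not values) of variables starting with AZURE_."""
    return sorted(key for key in environ if key.startswith("AZURE_"))


for package, version in installed_versions(CANDIDATE_PACKAGES).items():
    print(f"{package}: {version or 'not installed'}")

print("Azure-related environment variables:", azure_env_var_names())
```

Comparing this output between the 0.18 and 0.19.3 environments would show whether a dependency that handles `abfss` paths was added or changed during the upgrade.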
