Join the Kedro community

Alexandre Ouellet
Joined September 26, 2024

Hey there!
When using kedro-azureml with the AzureMLDataset type, it seems to use fsspec (as described in the documentation). Is there a way to use the "mode" parameter of AzureML's command so that the files are mounted in rw_mount mode, rather than downloaded individually through fsspec?

5 comments

To keep the other thread focused: is there a way to manage a dataset of about 1 million files in AzureML? The files are about 4k of binary data each and are entirely independent of each other.

7 comments

Hey there! Quick question about kedro-azureml. We are using AzureML, and we'd like to use AzureMLAssetDataset with dataset factories.
After a lot of headaches and debugging, it seems impossible to use both. Credentials are passed to AzureMLAssetDataset through a hook (after_catalog_created), but if you use dataset_patterns (that is, declare your dataset as "{name}.csv" or something similar), the hook is called before the patterned dataset has been instantiated.
Later, before_node_run is called and then AzureMLAssetDataset._load() runs, but the AzureMLAssetDataset.azure_config setter has not been called yet, since it is only called in the after_catalog_created hook. At first glance this looks like a kedro-azureml issue, because AzureMLAssetDataset._load() can be called without the setter having been called when the dataset comes from a dataset factory. It might also be a kedro issue: I think there should be an obvious way to set up credentials in that specific scenario, and I don't see one in the docs on hooks either.
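
To make the ordering concrete, here is a toy model of what I think is happening, in plain Python. It is not kedro or kedro-azureml code: ToyAssetDataset, ToyCatalog, run_node, and demo are made-up names, and the assumption that patterned datasets are only instantiated on first access is taken from the behavior described above, not from the library source.

```python
class ToyAssetDataset:
    """Stand-in for a dataset that needs a config injected before loading."""

    def __init__(self, name):
        self.name = name
        self._azure_config = None

    @property
    def azure_config(self):
        return self._azure_config

    @azure_config.setter
    def azure_config(self, value):
        self._azure_config = value

    def load(self):
        if self._azure_config is None:
            raise RuntimeError(f"{self.name}: azure_config was never set")
        return f"loaded {self.name} with {self._azure_config}"


class ToyCatalog:
    """Creates explicit datasets eagerly and patterned ones on first access."""

    def __init__(self, explicit, patterns):
        self.datasets = {name: ToyAssetDataset(name) for name in explicit}
        self.patterns = patterns  # e.g. ["{name}.csv"]

    @staticmethod
    def _matches(pattern, name):
        # Toy matching: compare only the literal suffix after the placeholder.
        suffix = pattern.split("}", 1)[-1]
        return name.endswith(suffix)

    def get(self, name):
        if name not in self.datasets:
            if not any(self._matches(p, name) for p in self.patterns):
                raise KeyError(name)
            # Assumption: the patterned dataset is only created here,
            # after the catalog-level hook has already run.
            self.datasets[name] = ToyAssetDataset(name)
        return self.datasets[name]


def after_catalog_created(catalog, config):
    # Hypothetical hook: it can only reach datasets that already exist.
    for dataset in catalog.datasets.values():
        dataset.azure_config = config


def run_node(catalog, name):
    return catalog.get(name).load()


def demo():
    catalog = ToyCatalog(explicit=["raw"], patterns=["{name}.csv"])
    after_catalog_created(catalog, {"workspace": "example"})
    results = {}
    for name in ("raw", "sales.csv"):
        try:
            results[name] = run_node(catalog, name)
        except RuntimeError as err:
            results[name] = f"error: {err}"
    return results
```

In this sketch the explicitly declared dataset loads fine, while the one resolved from the pattern fails because the setter never ran for it, which matches what I am seeing.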

15 comments