Hey Kedro community,
I'm currently working on a project where I'm trying to use kedro_mlflow to store kedro_datasets_experimental.netcdf datasets as artifacts. Unfortunately, I can't make it work.
The problem seems to be path related:

    kedro.io.core.DatasetError: Failed while saving data to dataset MlflowNetCDFDataset(filepath=S:/…/data/07_model_output/D2-24-25/idata.nc, load_args={'decode_times': False}, protocol=file, save_args={'mode': w}). 'str' object has no attribute 'as_posix'

I investigated it to the best of my abilities, and it seems to be related to the initialization of NetCDFDataset. Most datasets inherit from AbstractVersionedDataset and call its __init__ with their filepath, so that _filepath ends up as a PurePosixPath rather than a str. NetCDFDataset is missing this call, so the PurePosixPath is never created. I don't know whether this is the actual problem in the end, but it is the point where other datasets have their path set.

In the meantime I thought it might be because mlflow isn't capable of tracking datasets that don't inherit from AbstractVersionedDataset, but the kedro-mlflow documentation says MlflowArtifactDataset is a wrapper for all AbstractDatasets.

I tried adding self._filepath = PurePosixPath(filepath)
myself in site-packages, but I get a permission error on saving, and that's where my journey has to end. It would have been too good if this one-liner had fixed it ^^

Catalog entry:

    "{dataset}.idata":
      type: kedro_mlflow.io.artifacts.MlflowArtifactDataset
      dataset:
        type: kedro_datasets_experimental.netcdf.NetCDFDataset
        filepath: data/07_model_output/{dataset}/idata.nc
        save_args:
          mode: a
        load_args:
          decode_times: False

node.py

    import arviz as az

    def predict(model, x_data):
        # Run the model, then convert the inference data to an xarray Dataset
        idata = model.predict(x_data)
        return az.convert_to_dataset(idata)

pipeline.py

    pipeline_inference = pipeline(
        [
            node(
                func=predict,
                inputs={
                    "model": f"{dataset}.model",
                    "x_data": f"{dataset}.x_data",
                },
                outputs=f"{dataset}.idata",
                name=f"{dataset}.predict_node",
                tags=["training"],
            ),
        ]
    )
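
To show what I think is going on without needing Kedro installed, here is a minimal sketch of the suspected failure mode. All three names (StrPathDataset, PosixPathDataset, artifact_path) are hypothetical stand-ins I made up for illustration; they are not the real NetCDFDataset or MlflowArtifactDataset code. The sketch assumes the wrapper calls as_posix() on the wrapped dataset's _filepath, which is what the error message suggests.

```python
from pathlib import PurePosixPath


class StrPathDataset:
    """Hypothetical stand-in for a dataset that keeps _filepath as a plain str."""

    def __init__(self, filepath: str):
        self._filepath = filepath


class PosixPathDataset(StrPathDataset):
    """Hypothetical subclass that converts _filepath after the parent __init__."""

    def __init__(self, filepath: str):
        super().__init__(filepath)
        self._filepath = PurePosixPath(self._filepath)


def artifact_path(dataset) -> str:
    """Hypothetical stand-in for a wrapper that calls as_posix() on _filepath."""
    return dataset._filepath.as_posix()


path = "data/07_model_output/D2-24-25/idata.nc"

# A str-based _filepath fails in the same way as the traceback above.
try:
    artifact_path(StrPathDataset(path))
    error_message = None
except AttributeError as err:
    error_message = str(err)

print(error_message)

# With the conversion applied in a subclass, the same call succeeds.
print(artifact_path(PosixPathDataset(path)))
```

If this is really the cause, a small subclass like this inside the project might be a way to avoid editing site-packages. That assumes nothing else in NetCDFDataset relies on _filepath being a str, which I haven't verified.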