Updated 5 days ago

Trouble Running Kedro From Docker Build

At a glance

A community member is having trouble running their Kedro project from a Docker build. They are using MLflow and the kedro_mlflow.io.artifacts.MlflowArtifactDataset, and the artifacts are trying to access the local Windows path instead of the container's path. The community member is asking for additional settings to resolve this issue. The mlflow_tracking_uri is set to null.

In the comments, another community member suggests that the issue was solved by explicitly setting the mlflow_tracking_uri.

Hi guys,

I am having trouble running my Kedro project from a Docker build. I'm using MLflow and the kedro_mlflow.io.artifacts.MlflowArtifactDataset.

I followed the instructions for building the container from the kedro-docker repo, but when running, those artifacts want to access my local Windows path instead of the container's path. Do you guys know what additional settings I have to make? All my settings are pretty much vanilla. The mlflow_tracking_uri is set to null.

"{dataset}.team_lexicon":
  type: kedro_mlflow.io.artifacts.MlflowArtifactDataset
  dataset:
    type: pandas.ParquetDataset
    filepath: data/03_primary/{dataset}/team_lexicon.pq
    metadata:
      kedro-viz:
        layer: primary
        preview_args:
          nrows: 5

Traceback (most recent call last):
  
kedro.io.core.DatasetError: Failed while saving data to dataset MlflowParquetDataset(filepath=/home/kedro_docker/data/03_primary/D1-24-25/team_lexicon.pq, load_args={}, protocol=file, save_args={}).
[Errno 13] Permission denied: '/C:'
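
One way the '/C:' in the error above can arise is sketched below in Python. It assumes, hypothetically, that an artifact location was recorded as a Windows file URI on the host and is later interpreted inside a Linux container. The URI shown is made up for illustration and does not appear in the thread. Standard-library parsing then yields a path whose first directory is /C:, which a non-root user typically cannot create at the filesystem root.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical artifact location recorded on a Windows host.
# This value is an assumption for illustration, not taken from the thread.
windows_artifact_uri = "file:///C:/Users/me/project/mlruns/0/abc/artifacts"


def first_directory_on_posix(uri: str) -> str:
    """Return the first directory a Linux process would have to create for this URI."""
    # urlparse keeps the drive letter inside the path: "/C:/Users/..."
    posix_path = PurePosixPath(urlparse(uri).path)
    # parts looks like ('/', 'C:', 'Users', ...); keep the root plus the first component.
    return str(PurePosixPath(*posix_path.parts[:2]))


first_dir = first_directory_on_posix(windows_artifact_uri)
print(first_dir)
```

This is only a plausible mechanism, not a confirmed diagnosis. It is consistent with the dataset filepath in the traceback being a container path while the failing path starts with a Windows drive letter.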

Marked as solution

I should use the AI more 🫠. Solved by explicitly setting the mlflow_tracking_uri.

Was that our AI or the Ask AI button above?

It was the Ask AI button. Maybe I was lucky because it referenced the base MLflow integration in the docs.

Based on the error description, it appears you're encountering an issue with path handling when using MlflowArtifactDataset in a Docker container. When using MLflow with Kedro in Docker, you need to ensure proper configuration:
  1. Create a mlflow.yml configuration file in your conf/local directory.
  2. Configure the MLflow tracking URI in the configuration file:
       server:
         mlflow_tracking_uri: mlflow_runs
For Docker deployment, make sure your Docker container has the correct volume mounts and working directory set up. The working directory should be specified as the directory containing your Kedro project on the container.
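
To see why the working directory matters for a relative value such as mlflow_runs, here is a small Python sketch. It assumes a relative tracking URI is resolved against the directory the process runs from. Whether kedro-mlflow resolves it against the working directory or the project root is not stated in the thread, so treat that as an assumption. The helper name resolve_tracking_uri is hypothetical, and /home/kedro_docker is the container path that appears in the traceback.

```python
from pathlib import PurePosixPath


def resolve_tracking_uri(tracking_uri: str, working_dir: str) -> str:
    """Show where a tracking URI would point inside a Linux container.

    Hypothetical helper: values that already look like a URI are returned
    unchanged; anything else is treated as a local path.
    """
    if "://" in tracking_uri:
        return tracking_uri
    path = PurePosixPath(tracking_uri)
    if not path.is_absolute():
        # Assumption: relative values are resolved against the working directory.
        path = PurePosixPath(working_dir) / path
    return "file://" + str(path)


resolved = resolve_tracking_uri("mlflow_runs", "/home/kedro_docker")
print(resolved)
```

With the working directory set to the project folder in the container, the relative value lands inside the project rather than at a path inherited from the host.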

Now, unfortunately, my runs aren't logged when running the pipeline. Do you have any idea why that might be?

When I check the running container, MLflow has still created the default mlruns folder and not the URI I provided by setting
server:
  mlflow_tracking_uri: mlruns_docker
Even trying to use the default doesn't work.
server:
  mlflow_tracking_uri: mlruns
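
A quick way to narrow down why the configured value is not picked up is to check, inside the running container, which mlflow.yml files actually exist and whether the MLFLOW_TRACKING_URI environment variable is set. The Python sketch below does only that. It is a diagnostic under the assumption that the project follows the usual conf/<env>/mlflow.yml layout, and the function names are hypothetical. Whether conf/local is copied into the image depends on the Dockerfile and .dockerignore, which the thread does not show.

```python
import os
from pathlib import Path


def find_mlflow_configs(project_root):
    """List every conf/<env>/mlflow.yml below project_root, sorted for stable output."""
    root = Path(project_root)
    return sorted(
        path.relative_to(root).as_posix()
        for path in (root / "conf").glob("*/mlflow.yml")
    )


def report(project_root, environ=None):
    """Collect the config files found and the tracking URI from the environment, if any."""
    environ = os.environ if environ is None else environ
    return {
        "configs": find_mlflow_configs(project_root),
        "env_tracking_uri": environ.get("MLFLOW_TRACKING_URI"),
    }


# Run from the project directory inside the container.
print(report(Path.cwd()))
```

If the list does not include the file that was edited on the host, the container is probably running with a different configuration than expected.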
