Updated 3 weeks ago

The Mysterious Kedro-mlflow Behavior

Hello!

I am having problems with kedro-mlflow. I am running a pipeline (pipeline-name) that finishes without raising any errors. The problem appears when I open the MLflow UI: two runs are shown, one named pipeline-name and one with a random name. In the pipeline-name run the model is logged but no parameters are shown, whereas the run with the random name does have the model's hyperparameters logged. Moreover, that run never ends, even after the pipeline has finished executing.

Does anyone know what could be happening?

Thank you!

Hey @Martin Gonzalez, it would be helpful if you could share some error logs/screenshots or a repo I could use to reproduce this.

@Ankita Katiyar It seems to be a problem with the parameters passed to the node through the pipeline configuration file. When the node does not receive the parameters as an input, it does not generate two runs.

Could you explain a bit more about the parameters? They’re not coming from the conf/base/parameters.yml but somewhere else?

node(
    func=fit_xgboost,
    name='fit_xgboost',
    inputs=['X_train', 'y_train', 'params:xgboost_conf'],
    outputs='XGBRegressor',
)

No, the parameters are from conf/base/parameters.yml

So if there’s a params:something in the pipeline inputs, it’s creating a spurious run but not otherwise?

Okay, I’ll try this out and get back to you. CCing @Yolan Honoré-Rougé in case you have any insights

Hi, can you give your kedro / mlflow / kedro-mlflow versions?

Can you try downgrading each of them, one at a time?
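
For anyone collecting the versions asked for above, here is a minimal sketch that reads them from the active environment using the standard library's importlib.metadata. The helper names installed_versions and as_requirements are made up for this example and are not part of kedro, mlflow or kedro-mlflow.

```python
from importlib.metadata import PackageNotFoundError, version

# The three distributions discussed in this thread.
PACKAGES = ("kedro", "kedro-mlflow", "mlflow")


def installed_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if it is absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def as_requirements(versions):
    """Format the mapping in pip's `name==version` style, one package per line."""
    lines = []
    for name, ver in versions.items():
        lines.append(f"{name}=={ver}" if ver else f"{name} (not installed)")
    return "\n".join(lines)


if __name__ == "__main__":
    # Run this inside the same virtual environment that runs the pipeline.
    print(as_requirements(installed_versions()))
```

Pasting the printed lines into the thread makes it easy to compare environments and to downgrade one package at a time.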

I have the same issue

kedro==0.19.11
kedro-mlflow==0.14.0
mlflow==2.20.1

Just tested with
kedro==0.19.10
and it works fine

downgrading kedro to 0.19.10 solves the issue

Wow very surprising, this type of error usually comes from mlflow or kedro-mlflow, not kedro.

The bug lies somewhere here: https://github.com/kedro-org/kedro/compare/0.19.10...0.19.11, but I don't see anything major, so this is likely an unintended side effect. It seems to affect the hooks workflow.

For those having the issue, can you confirm you use the standard runner, and not ThreadRunner or ParallelRunner?

For the record, the following tests are failing:

The pipeline does not seem to be properly closed on error, and metrics create new runs.

@Yolan Honoré-Rougé yes, I was using the sequential runner

Thanks for looking into this @Yolan Honoré-Rougé, I’ll try to figure out what’s going on on the Kedro side today!

@Martin Gonzalez Can you try with pip install kedro-mlflow==0.14.3? It should be fixed.
