The Mysterious Kedro-mlflow Behavior

Question

Hello!

I am having problems with kedro-mlflow. I am running a pipeline (pipeline-name) that finishes without any errors. The problem appears when I open the MLflow UI, where two runs are shown: the pipeline-name run and a run with a random name. In the pipeline-name run the model is logged but no parameters are shown, whereas the run with the random name does have the model's hyperparameters logged. Moreover, this second run never ends, even after the pipeline has finished executing.

Does anyone know what could be happening?

Thank you!

Ankita Katiyar · Answer

Hey @Martin Gonzalez, it would be helpful if you could share some error logs/screenshots or a repo with which I could reproduce this.

Martin Gonzalez · Answer

@Ankita Katiyar

Martin Gonzalez · Answer

@Ankita Katiyar It seems to be a problem with the parameters passed to the node through the pipeline configuration file. When the node does not receive the parameters as input, it does not generate two runs.

Ankita Katiyar · Answer

Could you explain a bit more about the parameters? They're not coming from conf/base/parameters.yml but from somewhere else?

Martin Gonzalez · Answer

node(
    func=fit_xgboost,
    name='fit_xgboost',
    inputs=['X_train', 'y_train', 'params:xgboost_conf'],
    outputs='XGBRegressor',
)

No, the parameters are from conf/base/parameters.yml

Ankita Katiyar · Answer

So if there’s a params:something in the pipeline inputs, it’s creating a spurious run, but not otherwise?

Martin Gonzalez · Answer

yes

Ankita Katiyar · Answer

Okay, I’ll try this out and get back to you. CCing @Yolan Honoré-Rougé in case you have any insights.

Yolan Honoré-Rougé · Answer

Hi, can you give your kedro / mlflow / kedro-mlflow versions?
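One way to collect the versions asked for above is a small standard-library script. This is only a sketch: the helper name installed_versions is made up for illustration, and it assumes the packages were installed under the distribution names kedro, mlflow and kedro-mlflow in the active environment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in this environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Distribution names assumed from the thread.
    for name, ver in installed_versions(["kedro", "mlflow", "kedro-mlflow"]).items():
        print(f"{name}=={ver}" if ver else f"{name} is not installed")
```

Running it inside the same virtual environment as the pipeline matters, since a different environment may hold different versions.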

Yolan Honoré-Rougé · Answer

Can you try downgrading them one at a time?

Erwin Paillacan · Answer

I have the same issue with:
kedro==0.19.11
kedro-mlflow==0.14.0
mlflow==2.20.1

Just tested with kedro==0.19.10 and it works fine.

Martin Gonzalez · Answer

downgrading kedro to 0.19.10 solves the issue

Yolan Honoré-Rougé · Answer

Wow, very surprising. This type of error usually comes from mlflow or kedro-mlflow, not kedro.

Yolan Honoré-Rougé · Answer

Thanks for the info

Yolan Honoré-Rougé · Answer

The bug lies somewhere here: https://github.com/kedro-org/kedro/compare/0.19.10...0.19.11, but I don't see anything major so this is likely unexpected. It seems to affect the hooks workflow.

Yolan Honoré-Rougé · Answer

Three related yet different bugs in the last 2 days on kedro-mlflow: https://github.com/Galileo-Galilei/kedro-mlflow/issues/624, https://github.com/Galileo-Galilei/kedro-mlflow/issues/623, https://github.com/Galileo-Galilei/kedro-mlflow/issues/622

Yolan Honoré-Rougé · Answer

@Ankita Katiyar I strongly suspect that the runner executor messes up pluggy: https://github.com/kedro-org/kedro/compare/0.19.10...0.19.11#diff-6cfc8b43afcb8bfb74e8fe5f8ce1403285a191edc6abf63d763e000c572d8c5f

Yolan Honoré-Rougé · Answer

For those having the issue, can you confirm you use the standard runner, and not ThreadRunner or ParallelRunner?

Yolan Honoré-Rougé · Answer

https://github.com/kedro-org/kedro/blob/a064c469623d88734bc90ed7c889690d10535f15/kedro/runner/sequential_runner.py#L50-L53

Yolan Honoré-Rougé · Answer

For the record, the following tests are failing :

Yolan Honoré-Rougé · Answer

The pipeline does not seem to be properly closed on error, and metrics create new runs.

Martin Gonzalez · Answer

@Yolan Honoré-Rougé Yes, I was using the sequential runner.

Ankita Katiyar · Answer

Thanks for looking into this @Yolan Honoré-Rougé, I’ll try to figure out what’s going on on the Kedro side today!

Yolan Honoré-Rougé · Answer

@Martin Gonzalez Can you try with pip install kedro-mlflow==0.14.3? It should be fixed.
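The version combinations reported in this thread can be summarised in a small check. This is a hedged sketch, not an official compatibility rule: the function names are hypothetical, and the boundaries are assumptions drawn only from the messages above (kedro 0.19.11 with kedro-mlflow 0.14.0 showed the spurious run, kedro 0.19.10 did not, and kedro-mlflow 0.14.3 is expected to contain the fix). Whether kedro-mlflow 0.14.1 or 0.14.2 are affected is not stated here, so treating them as affected is a guess.

```python
def parse_version(text):
    """Turn a plain 'X.Y.Z' release string into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def matches_reported_issue(kedro_version, kedro_mlflow_version):
    """Return True for the combination reported as broken in this thread.

    Assumption: only kedro 0.19.11 paired with kedro-mlflow older than
    0.14.3 is flagged; other combinations are simply not covered.
    """
    return (
        parse_version(kedro_version) == (0, 19, 11)
        and parse_version(kedro_mlflow_version) < (0, 14, 3)
    )


if __name__ == "__main__":
    print(matches_reported_issue("0.19.11", "0.14.0"))
    print(matches_reported_issue("0.19.10", "0.14.0"))
    print(matches_reported_issue("0.19.11", "0.14.3"))
```

The parser only handles plain numeric releases such as 0.19.11; pre-release or development version strings would need a proper version library.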
