
Second runner run fails to save output after Kedro upgrade

Hi, I'm testing after upgrading to 0.19.9 and I found what seems like a bug: after running the pipeline a second time with a runner (as happens during test cases), the output is no longer saved in the catalog or returned as a value from the run. That wasn't the case in 0.19.8.


Could you share an example? This sounds suspicious; which runner are you using? I only recall a minor change with ThreadRunner.

trying to reproduce with spaceflights

I'll open a GitHub issue.

FYI: this unfortunately means that model servers packaged with kedro-mlflow work only once; after that they need a reboot.

When will we start making things that can actually last? One-use cutlery, one-use batteries, and now we get one-use servers >...<

Is this caused by the catalog work?

I was able to reproduce it, looking into it

I left a comment there. It's unclear to me why it breaks; I haven't been able to reproduce the error yet. I got both a and b as {} when I ran this on GitPod on 0.19.8 and 0.19.9.

Is this what your test looks like?

from kedro.io import DataCatalog
from kedro.runner import SequentialRunner

# pipeline factory from the spaceflights starter's data science pipeline
from spaceflights.pipelines.data_science import create_pipeline as create_ds_pipeline


def test_data_science_pipeline(caplog, dummy_data, dummy_parameters):
    pipeline = (
        create_ds_pipeline()
        .from_nodes("split_data_node")
        .to_nodes("evaluate_model_node")
    )
    catalog = DataCatalog()
    catalog.add_feed_dict(
        {
            "model_input_table": dummy_data,
            "params:model_options": dummy_parameters["model_options"],
        }
    )

    # run the same pipeline twice on the same catalog; both runs should
    # return the same free outputs
    a = SequentialRunner().run(pipeline, catalog)
    b = SequentialRunner().run(pipeline, catalog)
    assert a == b



Change the test like this and you'll reproduce it:

    pipeline = (
        create_ds_pipeline()
        .from_nodes("split_data_node")
        .to_nodes("train_model_node")
    )

evaluate_model_node does not return anything

and there are no free_outputs
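
(For context: a runner only returns pipeline outputs that are not registered in the catalog. Paraphrasing the logic under discussion, a sketch rather than the exact kedro source:)

def run(pipeline, catalog):
    # outputs with no registered dataset are "free" and get returned to the
    # caller; everything else is persisted via the catalog instead
    free_outputs = pipeline.outputs() - set(catalog.list())
    # ... execute all nodes, saving intermediate results into the catalog ...
    return {name: catalog.load(name) for name in free_outputs}

So with the original test, which runs through evaluate_model_node, both runs trivially return {}.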

Ya, OK. As the issue describes, I was using the test we have in the starter, and with that I cannot reproduce it.

I updated the comment there with the new test; I still think there is an issue with the memory dataset definition.

First run:

pipeline.outputs()={'y_test', 'X_test', 'regressor'}
registered_ds=['params:model_options', 'model_input_table']
memory_datasets={'model_input_table', 'params:model_options'}
free_outputs={'y_test', 'X_test', 'regressor'}

Second run:

pipeline.outputs()={'y_test', 'X_test', 'regressor'}
registered_ds=['X_test', 'params:model_options', 'model_input_table', 'X_train', 'regressor', 'y_test', 'y_train']
memory_datasets={'model_input_table', 'params:model_options'}
free_outputs=set()
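
(These dumps look like f-string debug prints, presumably something like the following placed inside the runner's run method:)

print(f"{pipeline.outputs()=}")
print(f"{registered_ds=}")
print(f"{memory_datasets=}")
print(f"{free_outputs=}")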

I can see now that on the 2nd run we return nothing for free_outputs. I'd expect 'y_test', 'X_test' and 'regressor' to be in memory_datasets, but they're not there, which is why free_outputs is missing them at the end.

I think the issue is with the shallow copy instead. Those free_outputs are initialised before the copy is made, and thus end up holding an incorrect reference.

I don't understand the need for the shallow copy, but by shifting the free_outputs declaration to after the shallow copy, I get the expected output.
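
To illustrate the pattern I mean (a hypothetical sketch, not the actual kedro source): if the shallow copy shares the underlying datasets dict with the original catalog, any default datasets added during run #1 leak back into the caller's catalog, so run #2 sees every output as already registered:

import copy


class Catalog:
    """Toy stand-in for DataCatalog, for illustration only."""

    def __init__(self, datasets):
        self._datasets = datasets

    def shallow_copy(self):
        # copy.copy() duplicates the object but NOT the _datasets dict,
        # so the copy and the original share the same underlying dict
        return copy.copy(self)

    def add(self, name, dataset):
        self._datasets[name] = dataset

    def list(self):
        return list(self._datasets)


catalog = Catalog({"model_input_table": "..."})

run_catalog = catalog.shallow_copy()
run_catalog.add("regressor", "MemoryDataset")  # default dataset created mid-run

# The addition leaked into the original catalog, so a second run computes
# pipeline.outputs() - set(catalog.list()) == set() and returns nothing:
print(catalog.list())  # ['model_input_table', 'regressor']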

but by shifting the free_outputs declaration to after the shallow copy, I get the expected output


Can you please give an example of what you mean? Moving the shallow copy should not change anything, and in the new catalog this method will be removed anyway.

FYI: this unfortunately means that model servers packaged with kedro-mlflow work only once; after that they need a reboot.

This is due to kedro-mlflow packaging objects as a pickle. When an object is loaded, its structure was defined by the class structure (DataCatalog) at saving time, but it runs with the class in your environment at loading time. If there is a mismatch, the object either does not load, or behaves as if the class were the one defined at loading time (e.g. here, with the behaviour of the latest version of Kedro).

Specifically here, once the bug is fixed you can just upgrade your Kedro version and it should resume working normally (no need to retrain the whole model). More generally, this issue on catalog serialisation should help kedro-mlflow models be more stable over time and not break between Kedro versions (e.g. just because a private internal attribute changes).
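
A toy illustration of that load-time behaviour (hypothetical class, not kedro-mlflow's actual code): pickle serialises the instance's state, but method code is looked up on whatever class definition is importable at load time:

import pickle


class Greeter:
    def __init__(self):
        self.name = "kedro"  # instance state: this is what gets pickled

    def greet(self):  # method code is NOT pickled; it is resolved on
        return f"hello {self.name}"  # the class available at load time


blob = pickle.dumps(Greeter())

# Simulate a library upgrade that changes the method between save and load:
Greeter.greet = lambda self: f"goodbye {self.name}"

print(pickle.loads(blob).greet())  # -> "goodbye kedro": old state, new code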
