Hi everyone,
I’m a Data Engineer, and my team is working on multiple pipelines, each addressing different use cases (1 use case = 1 pipeline). We have both ingestion pipelines and export pipelines delivering data to various clients.
We’re considering grouping certain nodes into a common library to be shared across these pipelines. I wanted to ask if this is considered a good practice within the Kedro framework. If so, could you recommend an approach or best practices for implementing this?
Additionally, do you have any recommendations for structuring a Kedro project when working with multiple pipelines like this?
Thanks in advance for your help!
Best regards,
El Guendouz Mohamed
Hello Mohamed 🙂
I only have limited Kedro experience, but we managed to reuse common nodes across our pipelines (our setup is broadly the same as yours).
To do that, we used Kedro's "namespace" feature for each pipeline.
Why? Because when you run your pipelines together, you can't have nodes with the same name in two different pipelines. Namespaces help you manage that, and you can have a different config for each namespace 🙂
Hi, thanks for using Kedro. As Theo suggested, you can reuse and structure your pipelines using namespaces and modular pipelines. The docs below have more information:
Namespaces - https://docs.kedro.org/en/stable/nodes_and_pipelines/namespaces.html
Modular Pipelines - https://docs.kedro.org/en/stable/nodes_and_pipelines/modular_pipelines.html
Thank you
Thank you for your advice on using namespaces in Kedro to manage common nodes across pipelines. I appreciate the links and will apply these practices to my projects.
Thanks again for your help!
Best regards,
Mohamed El Guendouz