Hi everyone,
I have a question about the ParallelRunner. I have a modular pipeline setup with many modular pipelines that I want to run in parallel. At the end of each modular pipeline, the computed results are saved to a PostgreSQL database. What I'm noticing is that even though some pipelines complete, memory utilization never goes down.

A little more info on my setup: I am running 32 workers on a 72-core machine, and I have thousands of modular pipelines that I want to run in parallel.

My question is this: does the ParallelRunner hold on to the Python objects until ALL of the modular pipelines are complete?
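To make the question concrete, here is a rough sketch of the pattern I'm asking about, using plain `concurrent.futures` rather than Kedro internals (`run_modular_pipeline` and the payload are just placeholders, not my actual pipeline code):

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_modular_pipeline(i):
    """Stand-in for one modular pipeline; returns a sizeable result."""
    return [i] * 1_000

results = {}
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = {pool.submit(run_modular_pipeline, i): i for i in range(8)}
    for fut in as_completed(futures):
        # If the runner keeps a reference to every completed result here,
        # none of them can be garbage-collected until the whole run ends,
        # which would match the memory behaviour I am seeing.
        results[futures[fut]] = fut.result()

print(len(results))  # 8
```

In other words: if the runner accumulates every pipeline's outputs like `results` above instead of releasing them once they are persisted to PostgreSQL, that would explain why memory never drops as pipelines finish.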