
Updated 5 months ago

Optimizing Spark code within Kedro pipelines


Hi team, are there any best practices for optimizing Spark code within Kedro pipelines? I have a large pipeline where, because of lazy evaluation, all the work only executes at the last node. I would like to look at execution plans, etc.

Any suggestions? I suppose this would apply to Polars/Ibis/other similar frameworks.

3 comments

marrrcin

You can analyze the execution plan in the Spark UI once you run the job.

Nok Lam Chan

This is up to the execution engine, i.e. Polars and Spark are going to have completely different execution plans. Ibis is different in that category.

Nok Lam Chan

So I agree with marrrcin: try the Spark UI.
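
For anyone who wants the plan text before the job runs, a minimal sketch follows. Nothing in it comes from the thread: `capture_plan` and `log_plan` are hypothetical helper names. It assumes the frame exposes an `explain(...)` method that either prints the plan through Python's stdout (as PySpark's `DataFrame.explain` does) or returns it as a string (as Polars' `LazyFrame.explain` does).

```python
import contextlib
import io
import logging
from typing import Any

logger = logging.getLogger(__name__)


def capture_plan(frame: Any, **explain_kwargs: Any) -> str:
    """Return the execution plan of a lazy frame as text.

    Assumption: `frame` has an `explain(...)` method. Keyword arguments are
    passed through unchanged, e.g. mode="formatted" for a Spark DataFrame.
    """
    buffer = io.StringIO()
    # Spark-style explain() prints the plan, so capture stdout while calling it.
    with contextlib.redirect_stdout(buffer):
        returned = frame.explain(**explain_kwargs)
    # Polars-style explain() returns the plan; otherwise use what was printed.
    return returned if isinstance(returned, str) else buffer.getvalue()


def log_plan(frame: Any, label: str, **explain_kwargs: Any) -> Any:
    """Log the plan and return the frame untouched, so a node can pass it on."""
    plan = capture_plan(frame, **explain_kwargs)
    logger.info("Execution plan for %s:\n%s", label, plan)
    return frame
```

Inside a node, something like `return log_plan(df, "my_node", mode="formatted")` would log the plan without triggering the computation. The plan is engine-specific, as noted above, and the Spark UI remains the place to see what actually ran.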
