Optimizing Kedro Pipeline Performance for Web API


Abhishek Bhatia

Hi Team! :kedro:

I have deployed my model inference pipeline as a kedro pipeline served through a Dockerized web API.
Input data and parameters from the incoming HTTP request are already handled, and I am able to run the kedro pipeline by initializing a KedroSession in code ✅

However, I am concerned about the kedro pipeline run time per request, which is too high (~1 minute).
Questions:

Is there a way to reduce kedro startup time?
My pipelines have a lot of persisted catalog entries. My idea is to convert every entry into a MemoryDataSet so that nothing needs to be persisted, which would save I/O time. However, transcoding would be a problem in that case. Any ideas?
Are there any other ways to speed up kedro init and the pipeline run in general?

Ideally, I want to make zero changes between the actual kedro pipeline and the inference kedro pipeline.

Thanks! 🙂

3 comments

Ankita Katiyar

Hey Abhishek, with the datasets - the CachedDataset might be helpful https://docs.kedro.org/en/stable/api/kedro.io.CachedDataset.html
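
To make the caching idea concrete, here is a rough sketch in plain Python of what a caching wrapper around a slow dataset does: the first load hits storage, later loads are served from memory, and saves write through to storage while refreshing the in-memory copy. This is not Kedro's implementation; `CachedWrapper`, `load_fn` and `save_fn` are hypothetical names used only for illustration.

```python
from typing import Any, Callable


class CachedWrapper:
    """Hypothetical stand-in for a dataset that keeps an in-memory copy."""

    _EMPTY = object()  # sentinel meaning "nothing cached yet"

    def __init__(
        self,
        load_fn: Callable[[], Any],
        save_fn: Callable[[Any], None],
    ) -> None:
        self._load_fn = load_fn    # slow read from disk / remote storage
        self._save_fn = save_fn    # slow write to disk / remote storage
        self._cache: Any = self._EMPTY

    def load(self) -> Any:
        # Only touch the slow storage when nothing is cached yet.
        if self._cache is self._EMPTY:
            self._cache = self._load_fn()
        return self._cache

    def save(self, data: Any) -> None:
        # Write through to storage, then keep the data in memory
        # so the next load does not need to read it back.
        self._save_fn(data)
        self._cache = data
```

In a long-lived web process this only helps if the wrapper object itself survives between requests; if the catalog is rebuilt on every request, the cache starts empty each time.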

Yolan Honoré-Rougé

Hi, kedro-mlflow and kedro-boot are specifically designed for this kind of optimisation for pipeline serving. Even if you don't use the plugins directly, parts of their code can serve as inspiration.
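
The general pattern behind serving a pipeline quickly can be sketched as "pay the startup cost once, then reuse it for every request". The snippet below is plain Python and does not reproduce the API of kedro-boot or kedro-mlflow; `WarmRunner`, `load_static` and `run_pipeline` are hypothetical names, and the model is a placeholder.

```python
from typing import Any, Callable, Dict


class WarmRunner:
    """Hypothetical holder for everything that is expensive to build."""

    def __init__(
        self,
        load_static: Callable[[], Dict[str, Any]],
        run_pipeline: Callable[[Dict[str, Any]], Dict[str, Any]],
    ) -> None:
        # Executed once when the web process starts, not once per request.
        self._static_inputs = load_static()
        self._run_pipeline = run_pipeline

    def handle_request(self, request_inputs: Dict[str, Any]) -> Dict[str, Any]:
        # Per request: merge the preloaded artifacts with the request data
        # and run only the computation itself.
        inputs = {**self._static_inputs, **request_inputs}
        return self._run_pipeline(inputs)


def load_static() -> Dict[str, Any]:
    # Placeholder for slow work such as reading a trained model from disk.
    return {"model_coefficient": 2.0}


def run_pipeline(inputs: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder for the inference nodes.
    return {"prediction": inputs["model_coefficient"] * inputs["x"]}


# Built at application startup and shared by all request handlers.
runner = WarmRunner(load_static, run_pipeline)
```

Each request handler would then call `runner.handle_request({"x": ...})`, so only request-specific inputs are processed per call.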

Abhishek Bhatia

Hey, thanks! This is the first time I have heard of kedro-boot, it sounds really good! I will definitely check it out. Thanks 🙂
