
Updated 6 days ago

Optimizing Kedro Pipeline Performance for Web API

Hi Team! :kedro:

  • I have deployed my model inference pipeline as Kedro pipelines, served through a Dockerized web API.
  • Input data and parameters from the incoming HTTP request are already handled, and I can run the Kedro pipeline by initializing a KedroSession in code ✅


However, I am concerned about kedro pipeline run time per request, which is too high (~1 minute).
Questions:

  1. Is there a way to reduce kedro startup time?
  2. My pipelines have a lot of persisted catalog entries. My idea is to convert every entry into a MemoryDataSet so that nothing needs to be persisted, which would save I/O time. However, transcoding would be a problem in that case. Any ideas?
  3. Are there any other ways to speed up Kedro init and the pipeline run in general?

Ideally I want to make zero changes between the actual Kedro pipeline and the inference Kedro pipeline.

Thanks! 🙂

3 comments

Hey Abhishek, for the datasets, CachedDataset might be helpful: https://docs.kedro.org/en/stable/api/kedro.io.CachedDataset.html
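To make the caching idea concrete, here is a minimal plain-Python sketch of the pattern a caching dataset wrapper follows: the first load hits storage, later loads are served from memory, and saves go to both. It does not use Kedro's API. `SlowDataset` and `CachingWrapper` are hypothetical names made up for illustration, and this is not the CachedDataset implementation.

```python
from typing import Any, Callable


class SlowDataset:
    """Hypothetical stand-in for a persisted catalog entry; counts storage reads."""

    def __init__(self, loader: Callable[[], Any]) -> None:
        self._loader = loader
        self.load_calls = 0
        self.saved: Any = None

    def load(self) -> Any:
        self.load_calls += 1
        return self._loader()

    def save(self, data: Any) -> None:
        self.saved = data


class CachingWrapper:
    """Illustrative wrapper (not Kedro code): keep the data in memory after first use."""

    _EMPTY = object()  # sentinel, so that None can also be cached

    def __init__(self, dataset: SlowDataset) -> None:
        self._dataset = dataset
        self._cache: Any = self._EMPTY

    def load(self) -> Any:
        # Only touch the underlying storage when nothing is cached yet.
        if self._cache is self._EMPTY:
            self._cache = self._dataset.load()
        return self._cache

    def save(self, data: Any) -> None:
        # Persist as before, and remember the value for later loads.
        self._dataset.save(data)
        self._cache = data


reference_table = SlowDataset(lambda: {"threshold": 0.5})
cached = CachingWrapper(reference_table)
first = cached.load()   # reads from the underlying dataset
second = cached.load()  # served from memory
```

The wrapper only helps while the object stays alive, so in a web API it would have to be created once at startup rather than once per request.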

Hi, kedro-mlflow and kedro-boot are specifically designed for this kind of optimisation for pipeline serving. Even if you don't use the plugins directly, parts of their code can serve as inspiration.
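As a rough illustration of the general serving pattern (pay the startup cost once, then run only the lightweight per-request part), here is a plain-Python sketch. It does not reproduce the API of kedro-mlflow or kedro-boot. Every name in it (`load_artifacts`, `InferenceApp`, `INFERENCE_NODES`) is hypothetical, and the "pipeline" is a simplified list of functions, not a Kedro pipeline.

```python
from typing import Any, Callable, Dict, List, Tuple

# Simplified stand-in for a pipeline node: (function, input names, output name).
Node = Tuple[Callable[..., Any], List[str], str]

LOAD_COUNT = {"artifacts": 0}  # tracks how often the expensive startup step runs


def load_artifacts() -> Dict[str, Any]:
    """Hypothetical expensive step: read models or lookup tables from storage."""
    LOAD_COUNT["artifacts"] += 1
    return {"model_weights": [0.5, 2.0]}


def scale(features: List[float], model_weights: List[float]) -> List[float]:
    return [x * w for x, w in zip(features, model_weights)]


def total(scaled: List[float]) -> float:
    return sum(scaled)


INFERENCE_NODES: List[Node] = [
    (scale, ["features", "model_weights"], "scaled"),
    (total, ["scaled"], "prediction"),
]


class InferenceApp:
    """Created once at web-server startup and reused for every request."""

    def __init__(self) -> None:
        self._artifacts = load_artifacts()  # startup cost paid a single time

    def predict(self, request_data: Dict[str, Any]) -> Any:
        # Per request: inject the request inputs next to the preloaded artifacts
        # and keep every intermediate result in memory only.
        data = {**self._artifacts, **request_data}
        for func, inputs, output in INFERENCE_NODES:
            data[output] = func(*[data[name] for name in inputs])
        return data["prediction"]


app = InferenceApp()
result_a = app.predict({"features": [2.0, 3.0]})
result_b = app.predict({"features": [4.0, 1.0]})
```

Both calls reuse the artifacts loaded in `__init__`, so only the node functions run per request.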

Hey, thanks! First time I've heard of kedro-boot, sounds really good! Will definitely check this out. Thanks 🙂
