Hey everyone,
I'm interested to know which orchestration service you prefer for running Kedro in production environments, and how the experience has been so far.
Recently I have been trying to run Kedro on Kubeflow and have been facing multiple issues.
Depends on your use case, but for A LOT of use cases Kubeflow is total overkill. It's especially difficult to install and maintain. If you treat it as an MLOps platform and plan to use as many of its features as possible (Katib/KServe/Notebooks/Pipelines etc.), maybe you will benefit from it (provided that you have the manpower to maintain it). For orchestration only, there are better options.
Thanks for responding to my post. Sorry for the late response from my end.
We are facing a couple of issues while trying to deploy a Kedro pipeline on Kubeflow.
1. Are your pipelines super I/O heavy? Was EFS introduced for efficiency or just for the ability to use ReadWriteMany in K8s?
2.b: I haven't worked with Argo/kfp for a while and I don't remember exactly where it's configured, but it's definitely doable.
3. Sounds like something you could contribute to kedro-kubeflow then.
4. The Kedro KFP plugin is not the most up to date as of now. There are definitely some missing bits. The biggest gain from using Kedro-Kubeflow was to detach the work done by DS teams (they stick to Kedro and are able to run things locally / iterate fast) from the work done by MLE/MLOps - they provide the infra, and Kedro is a common pipelining language to marry those two worlds together.
I'm coming back to the Kubeflow question. AFAIK, pipelines in GCP Vertex AI are authored using the Kubeflow Pipelines SDK, right? So there isn't really a choice. kfp had 6.5M downloads last month, so it definitely looks like a popular pipeline framework.
Hi, Vertex AI uses kfp as its default framework, that’s true, but regarding the integration with Kedro, there’s a separate plugin for that, kedro-vertexai, which is more up to date and uses one of the recent kfp versions.
Apart from the “pipeline translation” logic, which works very similarly to the kedro-kubeflow one, there are differences in authentication, scheduling and parameter handling between Vertex and standalone Kubeflow clusters.
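To make the "pipeline translation" idea a bit more concrete, here is a rough sketch of what such a step could look like in principle. It is not the actual code of kedro-kubeflow or kedro-vertexai. The `Node` class and `translate` function are hypothetical stand-ins written in plain Python, and the `kedro run --nodes` command line is an assumption whose exact flag depends on your Kedro version. The idea shown: each Kedro node becomes one orchestrator step, and a step depends on whichever steps produce its input datasets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a Kedro node: a name plus dataset names."""
    name: str
    inputs: tuple = ()
    outputs: tuple = ()


def translate(nodes):
    """Map each node to one orchestrator step and derive step dependencies.

    A step depends on every step that produces one of its input datasets.
    Datasets nobody produces (e.g. raw inputs) create no dependency.
    """
    # Record which node produces each dataset.
    producer = {}
    for node in nodes:
        for dataset in node.outputs:
            producer[dataset] = node.name

    steps = {}
    for node in nodes:
        upstream = sorted({producer[d] for d in node.inputs if d in producer})
        steps[node.name] = {
            # Assumed command line; each step runs a single node in a container.
            "command": ["kedro", "run", "--nodes", node.name],
            "depends_on": upstream,
        }
    return steps


if __name__ == "__main__":
    pipeline = [
        Node("split", inputs=("raw",), outputs=("train", "test")),
        Node("train_model", inputs=("train",), outputs=("model",)),
        Node("evaluate", inputs=("model", "test"), outputs=("metrics",)),
    ]
    for name, step in translate(pipeline).items():
        print(name, step["depends_on"], " ".join(step["command"]))
```

The real plugins then emit these steps through the kfp SDK, and that is where the Vertex vs. standalone Kubeflow differences (authentication, scheduling, parameter handling) come in. None of that is modeled here.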