
Updated 2 weeks ago

Exploring orchestration options: Kedro, Dagster, and beyond.

Does Kedro integrate well with Dagster? What flavours of orchestration do you guys enjoy? (Airflow, Dagster, Prefect, others)

We currently have very rudimentary orchestration with Azure Data Factory, and I am hoping to move my team onto a proper orchestrator as our needs grow.

7 comments

I've used Airflow 2.x before and it was fine, but I've heard many great things about both Dagster and Prefect.

@Luis Chaves Rodriguez kedro-dagster is still a WIP but I am hoping I'll be able to release something usable within the next month.

Airflow is also very widely adopted and has a mature Kedro plugin (kedro-airflow). It's a good choice if you need extensive scheduling options and already have Airflow expertise.

I'm working on integrating Kedro into our existing Prefect deployment. It's going well so far, but one point of complexity is how to connect Kedro parameters with Prefect parameters (so you can supply them at runtime). I've got a solution worked out, but I'd be curious how others have handled this for other orchestrators. Happy to share more if there's interest.
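For anyone following along, here is a minimal sketch of one way the parameter hand-off could work. It assumes the orchestrator hands over flat, dot-separated keys (similar in spirit to Kedro's `--params` CLI option) and that they get folded into the project's nested parameters before the run. The helper names `nest_runtime_params` and `merge_params` are hypothetical and are not part of Kedro or Prefect; this is not the solution mentioned above.

```python
from typing import Any


def nest_runtime_params(flat: dict[str, Any]) -> dict[str, Any]:
    """Turn {"model.lr": 0.1} into {"model": {"lr": 0.1}} (hypothetical helper)."""
    nested: dict[str, Any] = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        cursor = nested
        for part in parents:
            # Walk (or create) the intermediate dictionaries.
            cursor = cursor.setdefault(part, {})
        cursor[leaf] = value
    return nested


def merge_params(base: dict[str, Any], overrides: dict[str, Any]) -> dict[str, Any]:
    """Recursively overlay runtime overrides on the base parameters.

    Returns a new dictionary; `base` is left untouched.
    """
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_params(merged[key], value)
        else:
            merged[key] = value
    return merged
```

The nested result is the kind of dictionary you would then pass to the Kedro session as runtime parameters. How exactly that is wired up depends on your Kedro version and your flow definition.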

@Luis Chaves Rodriguez In short, there's no reason why Kedro can't integrate very well with Dagster. Conceptually, it maps very cleanly, much like dbt or SQLMesh. Very excited about @Guillaume Tauzin’s work.

From a Dagster internal perspective, there's so far been limited interest (or, probably more accurately, awareness). I think that's largely because no paying customers have asked for it, and there are relatively few OSS users so far. Hopefully there will be more interest over time. I've moved to the team within Dagster that works more on integrations, and it may also become more relevant when we try to expand to more dedicated ML workflow support.

For what it's worth, I think the Kedro-Dagster integration should be much better than the Kedro-Airflow integration once it's ready for use, just comparing the approaches. Kedro-Airflow follows a fairly naive/generic orchestrator mapping approach (the same as https://docs.kedro.org/en/stable/deployment/prefect.html and many of the other deployment guides), whereas Kedro-Dagster translates each piece to Dagster-native concepts. This means Dagster has a much better understanding of what you're doing in your Kedro pipeline, whereas most of the current orchestrator integrations treat groups of nodes essentially as black boxes to execute.
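To make the "generic mapping" idea concrete, here is a rough sketch under simplifying assumptions: each node becomes one opaque task, and the only thing the orchestrator learns is task ordering, derived from which node produces which dataset. The `Node` class and `task_dependencies` function are hypothetical illustrations, not code from kedro-airflow or kedro-dagster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a pipeline node: a name plus dataset names."""
    name: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]


def task_dependencies(nodes: list[Node]) -> dict[str, list[str]]:
    """Map each task to the upstream tasks that produce its inputs."""
    # Which node writes each dataset.
    producer = {out: node.name for node in nodes for out in node.outputs}
    return {
        node.name: sorted({producer[i] for i in node.inputs if i in producer})
        for node in nodes
    }


pipeline = [
    Node("split", ("raw",), ("train", "test")),
    Node("fit", ("train",), ("model",)),
    Node("evaluate", ("model", "test"), ("metrics",)),
]
deps = task_dependencies(pipeline)
```

In this sketch the orchestrator sees only the ordering in `deps`; the datasets themselves (`train`, `model`, `metrics`) stay invisible to it. A native translation would instead surface them as first-class objects in the orchestrator.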

Sounds exciting! I've read glowing reviews of Dagster vs alternatives. If anyone has experience self-hosting Dagster, I'd love to hear from you!
