ISSUE: Deployment of Kedro Pipelines on GCP with Dataproc and Cloud Composer
Description
I am conducting a POC with Kedro on a GCP environment and need assistance deploying my Kedro project in a GCP-compatible format. The goal is to package the Kedro project for execution on Cloud Dataproc clusters.
The intended workflow is as follows:
Unfortunately, I don't have recent GCP experience, so I can't answer this directly. Perhaps https://linen-slack.kedro.org/t/23168580/hey-kedroids-kedro-apologies-in-advance-for-the-long-message#614d3471-83e8-4a83-a29d-d40fa931b1a0 may help? (The approach seems fairly standard: package the Kedro project into a wheel the usual way, then submit that wheel to Dataproc.)
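The packaging-and-submission flow described above could look roughly like the following sketch. The bucket, cluster, region, and project names (`my-bucket`, `my-cluster`, `my_project`, etc.) are placeholders, not values from this thread; `kedro package` and the `gcloud dataproc jobs submit pyspark` command are real, but the wiring here is an assumption about how one might combine them:

```shell
# Build a wheel from the Kedro project (standard Kedro packaging).
# This produces dist/<package_name>-<version>-py3-none-any.whl
kedro package

# Hypothetical bucket name -- replace with your own.
gsutil cp dist/my_project-0.1-py3-none-any.whl gs://my-bucket/wheels/

# A small driver script is submitted as the PySpark main file;
# it simply invokes the packaged project's entry point.
cat > run_kedro.py <<'EOF'
from my_project.__main__ import main  # entry point in the packaged project
main()
EOF

# Submit to an existing Dataproc cluster, shipping the wheel as a dependency.
gcloud dataproc jobs submit pyspark run_kedro.py \
    --cluster=my-cluster \
    --region=europe-west1 \
    --py-files=gs://my-bucket/wheels/my_project-0.1-py3-none-any.whl
```

For Dataproc Serverless the submission command would presumably be `gcloud dataproc batches submit pyspark` instead of `jobs submit`, with the same wheel passed via `--py-files`.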
Hey @Mohamed El Guendouz, I can help with this request, as I have done this extensively across GCP Dataproc Serverless, Compute Engine, and Airflow (Cloud Composer).
I am contributing a GCP Dataproc deployment guide to Kedro's official docs here: https://github.com/kedro-org/kedro/pull/4393 (currently in draft). I can also speak to a lot more than the guide covers, e.g.:
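Since Cloud Composer is mentioned as the orchestration layer, one plausible way to trigger the Dataproc job from a Composer DAG is Airflow's `DataprocSubmitJobOperator` (from the `apache-airflow-providers-google` package, preinstalled on Composer). This is a sketch under assumptions, not the guide's method; project, region, bucket, and cluster names are placeholders:

```python
# Hypothetical Cloud Composer DAG submitting a packaged Kedro project to Dataproc.
# Assumes apache-airflow-providers-google is available (it is on Composer).
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.dataproc import DataprocSubmitJobOperator

# PySpark job spec: driver script plus the Kedro wheel as a dependency.
# All GCS paths and names below are placeholders.
PYSPARK_JOB = {
    "reference": {"project_id": "my-gcp-project"},
    "placement": {"cluster_name": "my-cluster"},
    "pyspark_job": {
        "main_python_file_uri": "gs://my-bucket/jobs/run_kedro.py",
        "python_file_uris": ["gs://my-bucket/wheels/my_project-0.1-py3-none-any.whl"],
    },
}

with DAG(
    dag_id="kedro_on_dataproc",
    start_date=datetime(2024, 1, 1),
    schedule=None,  # trigger manually; use schedule_interval on older Airflow
    catchup=False,
) as dag:
    DataprocSubmitJobOperator(
        task_id="run_kedro_pipeline",
        project_id="my-gcp-project",
        region="europe-west1",
        job=PYSPARK_JOB,
    )
```

The operator blocks until the Dataproc job finishes, so downstream tasks in the DAG only run after the Kedro pipeline completes.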