Rafael TEsta
Joined October 10, 2024

Hi everyone! I have a couple of questions about Kedro:

  1. I'm using an external Java tool to convert XML to linked data in one of my nodes, and the tool produces an output, but it's created outside of the Python function. Right now, I'm using a dummy dataset as an output and then using that as an input for the next node to make Kedro Viz visualize the connection properly. However, this feels a bit clumsy. Is there a more elegant way to sequentially connect nodes in Kedro without requiring a dataset in between?
  2. I would like to use Kedro for a project that performs the ETL for multiple institutes. I'm planning to use namespaces since the ETL process is similar for most institutes. After running the individual pipelines, there is part of the ETL that can either be run with the output from a single institute or sometimes needs to be run with the outputs from all institutes together. Currently, with a pure Python approach, we output each institute's data into a shared directory and then run the shared part using the content of that directory. However, Kedro doesn't allow multiple nodes to output to the same dataset (folder in this case). How could I connect the shared pipeline with each institute's pipeline in this case?
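
To make question 1 concrete, here is a minimal sketch of the dummy-dataset workaround as I understand it. Everything named here is hypothetical: `convert_xml`, `use_linked_data`, `converted_path`, and `STAND_IN_COMMAND`, which is just a Python one-liner that copies a file so the example runs without the real Java tool. The first function returns the path of the file the external tool wrote, and that small string is the "dummy" output that links the two nodes.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-in for the external Java converter: it simply copies
# the input file to the output path. The real call would be a java command.
STAND_IN_COMMAND = [
    sys.executable,
    "-c",
    "import shutil, sys; shutil.copy(sys.argv[1], sys.argv[2])",
]


def convert_xml(xml_path: str, out_path: str) -> str:
    """Run the external tool, which writes out_path itself, outside Python."""
    subprocess.run([*STAND_IN_COMMAND, xml_path, out_path], check=True)
    if not Path(out_path).is_file():
        raise FileNotFoundError(f"external tool did not create {out_path}")
    # The returned path is the "dummy" output that the next node consumes.
    return out_path


def use_linked_data(converted_path: str) -> int:
    """Downstream step: reads the file the tool produced, returns its length."""
    return len(Path(converted_path).read_text(encoding="utf-8"))


# In a Kedro project the two functions would be wired roughly like this
# (dataset and parameter names are assumptions):
#   node(convert_xml, inputs=["params:xml_path", "params:out_path"],
#        outputs="converted_path")
#   node(use_linked_data, inputs="converted_path", outputs="linked_data_size")

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "record.xml"
    source.write_text("<record/>", encoding="utf-8")
    marker = convert_xml(str(source), str(Path(tmp) / "record.out"))
    print(use_linked_data(marker))
```

The intermediate value carries no real data, only the location of the file, which is why it feels clumsy to me.
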
Thanks in advance for your help!

5 comments