Join the Kedro community

Luis Chaves Rodriguez
Joined January 3, 2025

Would Kedro users be opposed to defining nodes with decorators? I have written a simple implementation, but as I've only recently started using Kedro I wonder if I'm missing anything:

The syntax would be:

from kedro.pipeline import node, pipeline

# Proposed syntax: `node` used as a decorator
@node(inputs=1, outputs="first_sum")
def step1(number):
    return number + 1

@node(inputs="first_sum", outputs="second_sum")
def step2(number):
    return number + 1

@node(inputs="second_sum", outputs="final_result")
def step3(number):
    return number + 2

# Renamed so the variable does not shadow the imported `pipeline` function
my_pipeline = pipeline(
    [
        step1,
        step2,
        step3,
    ]
)

The node name could be inferred from the function name.
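
For illustration, here is a minimal sketch of how such a decorator could infer the node name from the function name. To keep it runnable without Kedro installed, `NodeSpec` is a hypothetical stand-in for a Kedro node and `as_node` is a made-up name; in a real project the decorator body would presumably call `kedro.pipeline.node(func, inputs, outputs, name=...)` instead.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence, Union

Names = Union[None, str, Sequence[str]]


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical stand-in for a Kedro node (not part of Kedro)."""
    func: Callable[..., Any]
    inputs: Names
    outputs: Names
    name: str


def as_node(inputs: Names = None, outputs: Names = None, name: Optional[str] = None):
    """Decorator factory (hypothetical name) that wraps a function as a NodeSpec."""
    def decorator(func: Callable[..., Any]) -> NodeSpec:
        # Fall back to the function name when no explicit node name is given.
        return NodeSpec(func, inputs, outputs, name or func.__name__)
    return decorator


@as_node(inputs="number", outputs="first_sum")
def step1(number):
    return number + 1


@as_node(inputs="first_sum", outputs="second_sum")
def step2(number):
    return number + 1


# The decorated names are now node-like objects that can be collected in a list.
steps = [step1, step2]
```

One trade-off of this design is that `step1` no longer refers to the plain function after decoration; the original callable is only reachable through `step1.func`.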

31 comments

How do you avoid over-applying DRY ("Don't Repeat Yourself") when using Kedro? Given the fairly opinionated syntax and project structure that Kedro proposes, I find it's easy to DRY up bits of code that would be better left un-DRY (e.g. preprocessing code). I wonder if anyone else has had similar thoughts.

13 comments

Hi Kedro community! I have run into an issue when working with Kedro inside a marimo notebook (I think the issue would be just the same in a Jupyter notebook). Initially I was working on my notebook by launching it from the command line in the Kedro project root folder, with something like: marimo edit notebooks/nb.py where my folder structure is something like:

├── README.md
├── conf
│   ├── base
│   ├── local
├── data ...
├── notebooks
│   ├── nb.py
├── pyproject.toml
├── requirements.txt
├── src ... 
└── tests ...
Within nb.py I have a cell that runs:
from pathlib import Path

import polars as pl  # assumed: `pl` in the later cell refers to polars
from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings
from kedro.io import DataCatalog

conf_loader = OmegaConfigLoader(
    conf_source=Path(__file__).parent / settings.CONF_SOURCE,
    default_run_env="base",
)

catalog = DataCatalog.from_config(
    conf_loader["catalog"], credentials=conf_loader["credentials"]
)

and later...
weekly_sales = pl.from_pandas(
    catalog.load("mytable")
)

The issue is that all the filepaths within the catalog are relative (e.g. conf/base/sql/somequery.sql or data/mydataset.csv) and assume that the catalog is being used from the Kedro project root, whereas the conf_source argument in the OmegaConfigLoader instance is an absolute path. So if I run my notebook from the root of my Kedro project, all is fine, but if I were to run: cd notebooks; marimo edit nb.py then catalog.load would attempt to load the query or dataset from notebooks/conf/base/sql/somequery.sql

Is it clear?

PS: please don't ask me why there is SQL code within the conf folder 😅, it's moving soon
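
One possible workaround, sketched here under assumptions, is to locate the project root explicitly instead of relying on the directory the notebook was launched from. The helper names `find_project_root` and `use_project_root` are hypothetical, and the sketch assumes that pyproject.toml marks the project root, as in the folder structure above. It only uses the standard library and does not touch any Kedro API.

```python
import os
from pathlib import Path


def find_project_root(start: Path, marker: str = "pyproject.toml") -> Path:
    """Walk upwards from `start` until a directory containing `marker` is found."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {start}")


def use_project_root(start: Path) -> Path:
    """Change the working directory to the project root and return it."""
    root = find_project_root(start)
    # Relative paths such as data/mydataset.csv now resolve against the root.
    os.chdir(root)
    return root
```

In the notebook this could be called as use_project_root(Path(__file__)) before building the catalog, so that relative catalog entries resolve the same way whether marimo was started from the project root or from notebooks/.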

10 comments

Hey, how do people use Kedro at scale? I've read a few tutorials on how to use Kedro for single projects but none on how to use it at scale. To me there would be an inherent benefit in creating modules with the pipeline step logic (like a shared nodes.py) and using those for common tasks rather than writing them in the pipeline-specific nodes.py. Does anybody do this?

I am keen to learn how people make the most out of kedro

4 comments