Kedro community

Fazil Topal
Joined September 12, 2024

Hi guys,

I'm trying to run kedro viz, and I am getting some strange errors like the ones below:

(projx) ⋊> ~/P/projx on master ⨯ uv run --with kedro-viz kedro viz run                                                          14:29:10
   Built projx @ file:///home/ftopal/Projects/projx
Uninstalled 1 package in 0.68ms
Installed 1 package in 1ms
Installed 98 packages in 109ms
[02/18/25 14:30:28] INFO     Using 'conf/logging.yml' as logging configuration. You can change this by setting the       __init__.py:270
                             KEDRO_LOGGING_CONFIG environment variable accordingly.                                                     
WARNING: Experiment Tracking on Kedro-viz will be deprecated in Kedro-Viz 11.0.0. Please refer to the Kedro documentation for migration guidance.
INFO: Running Kedro-Viz without hooks. Try `kedro viz run --include-hooks` to include hook functionality.
Starting Kedro Viz ...
[02/18/25 14:30:31] INFO     Using 'conf/logging.yml' as logging configuration. You can change this by setting the       __init__.py:270
                             KEDRO_LOGGING_CONFIG environment variable accordingly.                                                     
Process SpawnProcess-1:
Traceback (most recent call last):
  File "/home/ftopal/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/home/ftopal/.local/share/uv/python/cpython-3.10.16-linux-x86_64-gnu/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/ftopal/.cache/uv/archive-v0/TA93jbcQ_9KplZlKrI4mO/lib/python3.10/site-packages/kedro_viz/server.py", line 121, in run_server
    load_and_populate_data(
  File "/home/ftopal/.cache/uv/archive-v0/TA93jbcQ_9KplZlKrI4mO/lib/python3.10/site-packages/kedro_viz/server.py", line 70, in load_and_populate_data
    populate_data(data_access_manager, catalog, pipelines, session_store, stats_dict)
  File "/home/ftopal/.cache/uv/archive-v0/TA93jbcQ_9KplZlKrI4mO/lib/python3.10/site-packages/kedro_viz/server.py", line 44, in populate_data
    data_access_manager.add_pipelines(pipelines)
  File "/home/ftopal/.cache/uv/archive-v0/TA93jbcQ_9KplZlKrI4mO/lib/python3.10/site-packages/kedro_viz/data_access/managers.py", line 124, in add_pipelines
    self.add_pipeline(registered_pipeline_id, pipeline)
  File "/home/ftopal/.cache/uv/archive-v0/TA93jbcQ_9KplZlKrI4mO/lib/python3.10/site-packages/kedro_viz/data_access/managers.py", line 180, in add_pipeline
    input_node = self.add_node_input(
  File "/home/ftopal/.cache/uv/archive-v0/TA93jbcQ_9KplZlKrI4mO/lib/python3.10/site-packages/kedro_viz/data_access/managers.py", line 259, in add_node_input
    graph_node = self.add_dataset(
  File "/home/ftopal/.cache/uv/archive-v0/TA93jbcQ_9KplZlKrI4mO/lib/python3.10/site-packages/kedro_viz/data_access/managers.py", line 371, in add_dataset
    graph_node = GraphNode.create_data_node(
  File "/home/ftopal/.cache/uv/archive-v0/TA93jbcQ_9KplZlKrI4mO/lib/python3.10/site-packages/kedro_viz/models/flowchart/nodes.py", line 140, in create_data_node
    return DataNode(
  File "/home/ftopal/.cache/uv/archive-v0/TA93jbcQ_9KplZlKrI4mO/lib/python3.10/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 2 validation errors for DataNode
kedro_obj.is-instance[Node]
  Input should be an instance of Node [type=is_instance_of, input_value=[projx.models.llm.LLM(bac.../logs'), _logging=True)], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/is_instance_of
kedro_obj.is-instance[AbstractDataset]
  Input should be an instance of AbstractDataset [type=is_instance_of, input_value=[projx.models.llm.LLM(bac.../logs'), _logging=True)], input_type=list]
    For further information visit https://errors.pydantic.dev/2.10/v/is_instance_of

Any idea why I am getting this now? It is complaining about some input values, but everything works fine with kedro run.

8 comments

Hi everyone,

I am trying to build the following node:

node(
    func=lambda x: None,
    name="test-agent",
    inputs="master:claude-sonnet::worker:mistral-large",
    outputs=None
)

Here I customized my data catalog loader as follows:

from typing import Any

from kedro import io

# AgentLLM is my own class (defined elsewhere in the project) that wraps several models.


class DataCatalog(io.DataCatalog):
    def load(self, name: str, version: str | None = None) -> Any:
        if "::" in name:    # We have the agent-llm design, e.g. "master:claude-sonnet::worker:mistral-large"
            params = {}

            for n in name.split("::"):
                param, ds = n.split(":")
                params[param] = super().load(name=ds, version=version)

            return AgentLLM(**params)

        else:
            return super().load(name=name, version=version)

Essentially, I have different models defined in my catalog: claude-sonnet and mistral-large, along with about 10 other models. I wanted to customize the behaviour so that multiple models can be passed into a new class.

For now I was just passing them in one by one, but it looks like I have cases where I need to pass 3-4 different models and create intermediate classes which use them under the hood. I wanted to improve this setup with a new syntax like "master:claude-sonnet::worker:mistral-large", which creates an AgentLLM class with master and worker parameters. Unfortunately, this doesn't work as I expect, since I hit the following error in the runner code before it even reaches my catalog function:

ValueError: Pipeline input(s) {'master:claude-sonnet::worker:mistral-large'} not found in the DataCatalog
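For comparison, the standard wiring that does pass this validation declares each model as its own input and builds the wrapper inside the node. A minimal sketch (build_agent and the "agent" output name are hypothetical; AgentLLM is my class):

from kedro.pipeline import node

def build_agent(master, worker):
    # master and worker are loaded from the real catalog entries listed in `inputs`
    return AgentLLM(master=master, worker=worker)

node(
    func=build_agent,
    name="test-agent",
    inputs=["claude-sonnet", "mistral-large"],  # names exist in the catalog, so validation passes
    outputs="agent",
)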

What would be a good way to overcome this behaviour?

1 comment

In one of the threads I was told this command should work in any project:

uvx --with kedro-viz kedro viz run --lite

However, it doesn't work for me. Any clue why, or what's the recommended way of using it?

4 comments

Hi all,

Is there a reason we don't have caching support in PartitionedDataset? Imagine running an expensive computation and an error occurs in the middle, so a re-run is needed. I would assume logic to resume where we left off would be quite handy instead of starting all over again, especially in the case of returning a dict of callables for Kedro to invoke. I can certainly override this myself, but I was wondering if there was a particular reason why we don't have it yet.
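To illustrate the kind of resume logic I mean, a rough sketch (expensive_computation is a hypothetical function; the node also receives the already-saved output partitions as a second input so it can skip them):

def process_partitions(incoming: dict, existing: dict) -> dict:
    # incoming: partition id -> load callable, as returned by a PartitionedDataset
    # existing: partitions already saved by a previous (failed) run, also a PartitionedDataset
    results = {}
    for partition_id, load in incoming.items():
        if partition_id in existing:
            # Resume where we left off: this partition was already computed and saved.
            continue
        results[partition_id] = expensive_computation(load())
    return results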

5 comments

hey everyone,

Is there a way to run kedro-viz in Docker without actually installing the library locally? I am asking because I wanted to keep my environment a bit clean, and I thought Docker for viz would be nice. Has anyone done that before?

45 comments

Hi everyone,

I have the following files:
settings.py

CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "config_patterns": {
        # Also include models.yml in the catalog
        "catalog": [
            "catalog*",
            "models*",
            "catalog*/**",
            "models*/**",
            "**/catalog*",
            "**/models*",
        ],
    },
}
conftest.py
from pytest import fixture

from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings

# PROJECT_PATH is defined elsewhere in the conftest and points at the project root.


@fixture(scope='session')
def config_loader():
    kwargs = settings.CONFIG_LOADER_ARGS
    kwargs.update(env="test", base_env="base", default_run_env="test")

    return OmegaConfigLoader(
        conf_source=str(PROJECT_PATH / settings.CONF_SOURCE), **kwargs,
    )

I expect my settings values to be loaded, but this doesn't seem to be the case: config_patterns doesn't show up in the dict at all. Any idea why that is?
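For what it's worth, my current guess (an assumption on my part, not something I've confirmed) is that settings only picks up my settings.py after the project has been bootstrapped, so the fixture may need something like this first:

from kedro.framework.startup import bootstrap_project

bootstrap_project(PROJECT_PATH)  # should make settings.CONFIG_LOADER_ARGS reflect settings.py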

3 comments

Hey guys,

Me again 😄 I had a question regarding the parquet dataset itself. I often encounter issues with custom datatypes during saving. For instance, if I have a custom class in my dataframe, I would like to keep it as is (the reason why I use parquet in the first place). I know custom serializer/deserializer code is required to do this. I can of course do it in my own code, but since it is IO related, I believe it should be done in the dataset definition, where I can somehow point to my custom serializer that gets applied before writing to file. I will work on the extended version now; I was wondering whether this has been discussed before? I am happy to push it as a PR later.
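Roughly what I have in mind, as a sketch (SerializingParquetDataset and its serializer/deserializer hooks are hypothetical names, not an existing Kedro API):

import pandas as pd

from kedro.io import AbstractDataset


class SerializingParquetDataset(AbstractDataset):
    """Parquet dataset with pluggable (de)serialization for custom column types."""

    def __init__(self, filepath: str, serializer=None, deserializer=None):
        self._filepath = filepath
        self._serializer = serializer        # e.g. converts custom objects to plain types
        self._deserializer = deserializer    # rebuilds the custom objects on load

    def _load(self) -> pd.DataFrame:
        df = pd.read_parquet(self._filepath)
        return self._deserializer(df) if self._deserializer else df

    def _save(self, data: pd.DataFrame) -> None:
        if self._serializer:
            data = self._serializer(data)
        data.to_parquet(self._filepath)

    def _describe(self) -> dict:
        return {"filepath": self._filepath}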

16 comments

hey all,

I noticed that when I define the following node:

node(
    func=sample_func,
    # name="sample_func"    -> does not work when this is commented
    inputs="epubdf",
    outputs="result"
)

and then run this test:
kedro_session.run(node_names=['sample_func'])
I get an error saying the name doesn't exist, but when I specify the name parameter on the node itself, it works. I thought that if I don't provide it, the name would default to the function name itself, no?
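For reference, a quick way to see what name Kedro actually assigns when name is omitted (on my version the default seems to be derived from the full node signature, not just the function name):

from kedro.pipeline import node

def sample_func(epubdf):
    return epubdf

n = node(func=sample_func, inputs="epubdf", outputs="result")
print(n.name)  # prints something like "sample_func([epubdf]) -> [result]", not "sample_func"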

9 comments

Hey everyone,

I was wondering if there was a way to do the node declaration as follows:

pipelines/data.py

@node(inputs=..., outputs=...)
def preprocess(...):
    ...
The reason I ask is that my nodes are growing, and it is becoming a bit tiring to go and import each function and do the Kedro wiring, and occasionally I make typos because I've forgotten the function signature. Something like this would be quite nice to declare.
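To make the idea concrete, a minimal sketch of the kind of decorator I mean (the node decorator and NODES registry are hypothetical, not an existing Kedro API; the dataset names are placeholders; it just wraps kedro.pipeline.node):

from kedro.pipeline import node as kedro_node

NODES = []  # collected later by the pipeline module, e.g. pipeline(NODES)


def node(**node_kwargs):
    def decorator(func):
        NODES.append(kedro_node(func=func, **node_kwargs))
        return func
    return decorator


@node(inputs="raw_data", outputs="preprocessed_data", name="preprocess")
def preprocess(raw_data):
    ...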

4 comments

Hey guys,

Do we know how to pass credentials to a node in Kedro? Are they only meant to be accessed by a dataset loader? I have code that makes API calls (LLMs), and either I set the keys as environment variables or pass them from my local credentials. I can manually load them for sure, but I was looking for a better way. Maybe something similar to parameters, like credentials: openai?
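For context, the manual loading I mean looks roughly like this (a sketch assuming a conf/local/credentials.yml with an openai entry):

from kedro.config import OmegaConfigLoader

config_loader = OmegaConfigLoader(
    conf_source="conf", base_env="base", default_run_env="local"
)
openai_creds = config_loader["credentials"]["openai"]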

25 comments