hey everyone,
Is there a way to run kedro-viz on docker without actually installing the lib? I am asking because i wanted to keep the env a bit clean and I thought docker for viz would be nice. Did anyone do that before?
so you can basically compile a portable version that's just a single page app
https://docs.kedro.org/projects/kedro-viz/en/stable/platform_agnostic_sharing_with_kedro_viz.html
hmm, but I would still have to run the build every time I update my command, no? I was asking for myself, I don't need to share with someone else. I would then put this into my project's docker compose file so that I can run kedro viz in an isolated docker image, not in my local env
I'm not sure if that's possible at the moment. Kedro-Viz works by reading the Kedro project and creating JSON endpoints, which the frontend uses for visualization. Even if you host the frontend in a Docker container, you'll still need Kedro-Viz as a library to convert the Kedro project into JSON files.
One option is maybe you keep your project clean. and if your project is on github, you could use https://github.com/kedro-org/publish-kedro-viz -- this would do the kedro-viz installation on the Github Ci and host your kedro-viz on Github pages
I don't think there is an official way, but there's nothing to stop you from creating your own docker to run kedro-viz in a container.
To do that you will need both project and kedro-viz dependencies inside your docker.
It's also completely fine to run kedro-viz in a separate virtual env. It's more or less the same idea as Docker, depending on how much isolation you are looking for
thanks, the github page is also cool, might try that later. yes, I will possibly do that. I was just wondering if anyone did that before, but yeah, I can write some docker config files to do that
if you get a nice solution working please share, always keen to understand how best to do this
you may also be interested in kedro-viz --lite, which was just shipped and builds the DAG through AST introspection without actually executing it, so you can now run kedro-viz without any of the actual dependencies (other than Kedro) installed
ideally, uvx --with kedro-viz kedro viz run --lite should work in any project. I just tested it.
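To make the "AST introspection" idea concrete, here is a minimal sketch (not kedro-viz's actual implementation; the function names are made up for illustration) that parses source code without running it and lists the top-level imports that are not installed:

```python
import ast
import importlib.util


def top_level_imports(source: str) -> set:
    """Collect top-level package names imported by `source`, without executing it."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # level == 0 skips relative imports such as `from .base import X`
            names.add(node.module.split(".")[0])
    return names


def missing_modules(source: str) -> list:
    """Return the imported top-level packages that cannot be found in this env."""
    return sorted(
        name for name in top_level_imports(source)
        if importlib.util.find_spec(name) is None
    )
```

A tool could then decide what to do with the missing names (for example, substitute placeholders) instead of failing on import.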
also, since you were asking specifically about Docker:
$ cat Dockerfile
FROM python:3.9-slim
RUN pip install uv && uv pip install --system kedro-viz kedro
EXPOSE 4141
WORKDIR /app
ENTRYPOINT ["kedro", "viz", "run", "--lite", "--host", "0.0.0.0"]
$ docker build -t kedro-viz-lite .
...
$ docker run -p 4141:4141 -v ~/Projects/demo:/app kedro-viz-lite
Thanks for this. I've tried different combos and somehow I end up getting the following error:
  File "/app/src/projx/models/text/base.py", line 34, in <module>
    class AnthropicAssistant(BaseMessage):
  File "/opt/conda/envs/py/lib/python3.10/dataclasses.py", line 1184, in dataclass
    return wrap(cls)
  File "/opt/conda/envs/py/lib/python3.10/dataclasses.py", line 1175, in wrap
    return _process_class(cls, init, repr, eq, order, unsafe_hash,
  File "/opt/conda/envs/py/lib/python3.10/dataclasses.py", line 908, in _process_class
    for b in cls.__mro__[-1:0:-1]:
  File "/opt/conda/envs/py/lib/python3.10/unittest/mock.py", line 643, in __getattr__
    raise AttributeError("Mock object has no attribute %r" % name)
AttributeError: Mock object has no attribute '__mro__'
For some reason it leads to dataclasses and errors out there. The code works fine normally, I am not sure why it does that. I thought it was uv stuff, but replicating the same env in the container also results in the same error. I'll have a look later
this is our fault for sure, kedro viz --lite uses unittest Mock. cc
do you mind opening an issue on Kedro Viz about this?
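For anyone hitting the same thing, the error can be reproduced outside Kedro. This is a minimal sketch, assuming the base class ends up replaced by a MagicMock (the names mirror the traceback above but are stand-ins, not the real project code). As far as I can tell, subclassing a mock instance makes the class statement itself return a mock rather than a class, and @dataclass then fails when it looks for __mro__:

```python
from dataclasses import dataclass
from unittest.mock import MagicMock


def define_dataclass_on_mocked_base() -> str:
    """Return a description of the error raised when a dataclass subclasses a mock."""
    # Stand-in for a base class whose import was replaced by a mock (assumption).
    BaseMessage = MagicMock()
    try:
        @dataclass
        class AnthropicAssistant(BaseMessage):
            text: str = ""
    except Exception as exc:  # broad on purpose: we only want to report what happened
        return f"{type(exc).__name__}: {exc}"
    return "no error"


if __name__ == "__main__":
    print(define_dataclass_on_mocked_base())
```

So the user code is fine; it is the mocked base class that dataclasses cannot handle.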
Ahh, okay, I was super confused. I will open one soon.
I tried without lite and I'm still getting errors about my custom dataset definitions. I played with PYTHONPATH but no luck
Example: kedro.io.core.DatasetError: Class 'projx.models.audio.io.LargeModel' not found, is this a typo?
projx.models.audio.io.LargeModel
open a python terminal: from projx.models.audio.io import LargeModel
that must be a separate error, I'm sure. if python -c "from projx.models.audio.io import LargeModel" works but kedro run doesn't, then you have some problem with your installation
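The same check can be scripted. This is a small sketch (the helper name diagnose_dataset_class is made up here, it is not part of Kedro) that imports the module behind a dotted class path and reports the underlying import error instead of a generic "not found":

```python
import importlib


def diagnose_dataset_class(class_path: str) -> str:
    """Try to import `package.module.ClassName` and say what went wrong, if anything."""
    module_path, _, class_name = class_path.rpartition(".")
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # third-party dependency shows up here with its real name.
        return f"import failed: {exc!r}"
    if not hasattr(module, class_name):
        return f"module imported, but {class_name!r} is not defined in it"
    return "ok"


if __name__ == "__main__":
    print(diagnose_dataset_class("projx.models.audio.io.LargeModel"))
```

Run it inside the container, since that is the environment Kedro actually sees.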
It was inside the docker. Looks like some other deps were missing: when I ran that in python I got "No module named elevenlabs", so installing that fixed it. I wonder why that error is not thrown in kedro tho
For some context on kedro viz --lite, it only mocks dependencies within your kedro project. It does not mock any transitive dependencies.
For this: "i got the no module named elevenlabs so installing that fixed it. I wonder why that error is not thrown in kedro tho", do you mean kedro viz --lite did not raise an error, or kedro run?
Well, actually both of them work now; the missing dependency produced some different error outputs. Not sure why
I do have a different problem now
viz:
  image: projx:viz
  build:
    context: .
    target: viz
  entrypoint: [ "bash" ]
  command:
    - -c
    - "kedro viz run --host 0.0.0.0"
  ports:
    - 4141:4141
  volumes:
    - ./:/app
This returns the error
bash: line 1: kedro: command not found
but when I comment out the command section and then run kedro viz run inside the container, it works. Does anyone have a clue? I feel like I'm missing something super obvious here
Yeah, I'm also getting the error when I add this line to the dockerfile: ENTRYPOINT ["kedro", "viz", "run", "--lite", "--host", "0.0.0.0"]
what's the best way to add kedro to the container's bin path?
command:
  - -c
  - "source ~/.bashrc && kedro viz run --host 0.0.0.0"
Try if the above command works. If not, can you try installing kedro and kedro viz globally while creating the image?
"what's the best way to add kedro to containers bin path?"
You are not supposed to do that. Installing kedro-viz should already add it to the python binary path automatically.
I think some python level stuff was in the bashrc, so invoking that solved it. Now I see kedro viz as expected. Thanks for the support
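A note on why this happens: a non-interactive bash -c normally does not read ~/.bashrc, so anything that file adds to PATH (for example a conda activation) is missing. Here is a small sketch for debugging that from Python; resolve_cli is a made-up helper, and the python -m fallback assumes the interpreter running the script is the one kedro was installed into:

```python
import shutil
import sys


def resolve_cli(executable: str, module: str) -> list:
    """Return a command prefix for a CLI, preferring the executable found on PATH."""
    path = shutil.which(executable)
    if path is not None:
        return [path]
    # Not on PATH: fall back to running the package as a module with this interpreter.
    return [sys.executable, "-m", module]


if __name__ == "__main__":
    # Prints either the resolved kedro path or the `python -m kedro` fallback.
    print(resolve_cli("kedro", "kedro") + ["viz", "run", "--host", "0.0.0.0"])
```

If the first branch never triggers inside the container, the install location is simply not on PATH for that shell.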
Should we still open an issue about user-level code errors being hidden in kedro? I feel this led to extra debugging sessions, whereas it should have been clear from the beginning that a dependency was missing. Somehow the error is being caught somewhere
"I think some python level stuff was on the bashrc so invoking that solved it."
I think it's most likely the virtual env
No, I meant the earlier stack traces where "No module named elevenlabs" was not thrown in kedro
We have been fighting this a bit to deal with two conflicting requirements:
The issue i had was this:
Traceback (most recent call last):
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/io/core.py", line 159, in from_config
    class_obj, config = parse_dataset_definition(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/io/core.py", line 501, in parse_dataset_definition
    raise DatasetError(
kedro.io.core.DatasetError: Class 'projx.models.audio.io.LargeModel' not found, is this a typo?
Hint: If you are trying to use a dataset from `kedro-datasets`, make sure that the package is installed in your current environment. You can do so by running `pip install kedro-datasets` or `pip install kedro-datasets[<dataset-group>]` to install `kedro-datasets` along with related dependencies for the specific dataset group.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/conda/envs/py/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/opt/conda/envs/py/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro_viz/server.py", line 122, in run_server
    load_and_populate_data(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro_viz/server.py", line 59, in load_and_populate_data
    catalog, pipelines, session_store, stats_dict = kedro_data_loader.load_data(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro_viz/integrations/kedro/data_loader.py", line 172, in load_data
    return _load_data_helper(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro_viz/integrations/kedro/data_loader.py", line 101, in _load_data_helper
    catalog = context.catalog
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/framework/context/context.py", line 190, in catalog
    return self._get_catalog()
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/framework/context/context.py", line 234, in _get_catalog
    catalog: DataCatalog = settings.DATA_CATALOG_CLASS.from_config(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/io/data_catalog.py", line 330, in from_config
    datasets[ds_name] = AbstractDataset.from_config(
  File "/opt/conda/envs/py/lib/python3.10/site-packages/kedro/io/core.py", line 163, in from_config
    raise DatasetError(
kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'narrator#lam':
Class 'projx.models.audio.io.LargeModel' not found, is this a typo?
Hint: If you are trying to use a dataset from `kedro-datasets`, make sure that the package is installed in your current environment. You can do so by running `pip install kedro-datasets` or `pip install kedro-datasets[<dataset-group>]` to install `kedro-datasets` along with related dependencies for the specific dataset group.
so this is what I saw, and when I ran the import statement in python as you suggested, I got this:
>>> from projx.models.audio.io import LargeModel
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/app/src/projx/models/audio/__init__.py", line 1, in <module>
    from .base import LAM
  File "/app/src/projx/models/audio/base.py", line 1, in <module>
    from elevenlabs.client import ElevenLabs
ModuleNotFoundError: No module named 'elevenlabs'
so I sort of was expecting kedro to show this in the first place. If there is an open issue about it, I'm happy to comment there; otherwise I'd open a new one
I was under the impression that we had fixed this as part of https://github.com/kedro-org/kedro/issues/2943, but maybe this is yet another case we have to handle?
kedro.io.core.DatasetError: An exception occurred when parsing config for dataset 'ingestion.int_typed_companies':
No module named 'pandas'. Please see the documentation on how to install relevant dependencies for kedro_datasets.pandas.ParquetDataset:
https://docs.kedro.org/en/stable/kedro_project_setup/dependencies.html#install-dependencies-related-to-the-data-catalog
pandas.CSVDataset when I have kedro-datasets installed but not pandas (pip uninstall on purpose)
Hmm, could it be related to custom datasets created by the user? This example uses my own custom dataset definition; perhaps it doesn't work as expected there?
could be, but I am not 100% sure here. The way kedro-datasets is structured, each dataset usually has its own module with an __init__ and an implementation file: some_dataset_module with from .some_dataset import XYZDataset. We also have lazy loading implemented, which could be another reason why we are able to catch the error better for kedro-datasets
I have the following one:
dataset_name
- __init__.py -> from .base import LAM
- base.py -> Wrapper code around the API endpoint; defines LAM
- io.py -> Reads the kedro config and creates the LAM instance
So basically the module's __init__ would try to load the package elevenlabs, which is imported in base.py, and that's the error that was not caught.
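To illustrate the layout above, here is a self-contained sketch that builds a throwaway package with the same shape (the package name demo_audio_pkg and the dependency missing_dep_for_demo are invented stand-ins for the real project and for elevenlabs). Importing the io module runs __init__.py first, so the missing dependency in base.py surfaces even though io.py never imports it:

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import Optional


def import_io_module() -> Optional[str]:
    """Import demo_audio_pkg.io and return the name of the missing module, if any."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "demo_audio_pkg"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("from .base import LAM\n")
        # base.py imports a third-party package that is not installed.
        (pkg / "base.py").write_text("import missing_dep_for_demo\n\nclass LAM: ...\n")
        (pkg / "io.py").write_text("class LargeModel: ...\n")
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            importlib.import_module("demo_audio_pkg.io")
        except ModuleNotFoundError as exc:
            return exc.name  # the module that could not be found
        finally:
            sys.path.remove(tmp)
            # Drop any partially imported modules so the demo can be re-run.
            for name in [n for n in sys.modules if n.startswith("demo_audio_pkg")]:
                del sys.modules[name]
    return None
```

The ModuleNotFoundError names the real culprit, which is the piece of information the "Class ... not found" message above did not show.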