Updated 5 months ago

Dataset format not respecting png argument

At a glance

The community member has a dataset in their catalog that does not respect the format: png argument, and instead saves the files as text files. They are looking for a hypothesis on why this might be the case, as the documentation suggests it should work. The comments suggest verifying the behavior with just PartitionedDataset without MLflowArtifactDataset, and potentially reporting the issue on the #C03RKPCLYGY channel if it is related to kedro-mlflow. Another community member notes that the matplotlib dataset is special and can take a list or dictionary of output. The original community member eventually deprioritized the issue and found a workaround by making the keys of the dict[str, Figure] end with ".png".

Hi guys 🙂

I have the following dataset in my catalog, and it doesn't respect the format: png argument; I suspect the whole save_args section of the matplotlib dataset is being ignored. It doesn't fail, it just saves the files as text files instead of PNG.

Looking at this syntax, do you have a hypothesis why it may be the case?

Based on this section of the docs, it should work.

Attachment
Screenshot 2024-09-30 at 5.50.11 PM.png
4 comments

  1. Please verify whether the behaviour is the same if you use just PartitionedDataset, without MLflowArtifactDataset.
  2. If the above works, then ask on #C03RKPCLYGY, because the issue would be related to kedro-mlflow.
  3. Otherwise, let's continue here, as it might be a bug in Kedro.

Note that the matplotlib dataset is a special one: it can take a list or a dictionary of figures as output (built-in partitioning, basically).

I've seen people doing that a lot with kedro-mlflow so it should definitely work. Did you have a chance to investigate as suggested above?

Hi!
Unfortunately no, I deprioritized this, and I’m achieving what I need by making the keys of the dict[str, Figure] that’s being saved end with “.png”.
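
For anyone landing here later, a minimal sketch of that workaround, assuming the node returns a dict keyed by plot name. The helper name with_png_suffix is hypothetical (it is not part of Kedro or kedro-mlflow); it only renames the keys and leaves the figures untouched.

```python
from typing import Dict, TypeVar

# T stands in for matplotlib.figure.Figure; the helper never inspects the values.
T = TypeVar("T")


def with_png_suffix(figures: Dict[str, T]) -> Dict[str, T]:
    """Return a copy of `figures` in which every key ends with '.png'.

    Hypothetical helper illustrating the workaround from this thread:
    keys that already end with '.png' (case-insensitive) are kept as-is,
    all other keys get '.png' appended.
    """
    return {
        (name if name.lower().endswith(".png") else f"{name}.png"): fig
        for name, fig in figures.items()
    }
```

In a node, this would wrap the dict just before it is returned, e.g. return with_png_suffix({"loss_curve": fig_a, "confusion_matrix": fig_b}), where both figure names are made up for the example.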
