
Defining Filters in Load Args for Dask ParquetDataset

Hi everyone,

I need some help understanding how to define filters in load_args when loading a ParquetDataset with Dask from the catalog.

My catalog entry would be something like:

data:
  type: dask.ParquetDataset
  filepath: data/
  load_args :
    filters: [('filter_1', '==', 1) or
                ('filter_2', '==', 1) or
                ('filter_3', '==', 1) or
                ('filter_4', '==', 1) ]
I tested this exact filter syntax in the Python API and it works there, but I cannot find a way to make it work from the catalog, since loading raises:
kedro.io.core.DatasetError: Failed while loading data from data set 
An error occurred while calling the read_parquet method registered to the pandas backend.
Original Message: too many values to unpack (expected 3)
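For reference, PyArrow-style `filters` use disjunctive normal form: the outer list ORs together inner lists of AND-ed `(column, op, value)` predicates, and YAML has no tuple literal, so each predicate becomes a three-element list. A sketch of what the catalog entry might look like (untested; it assumes `dask.ParquetDataset` passes `load_args` straight through to `dd.read_parquet`):

```yaml
data:
  type: dask.ParquetDataset
  filepath: data/
  load_args:
    filters:   # DNF: outer list = OR, each inner list = AND of predicates
      - [["filter_1", "==", 1]]
      - [["filter_2", "==", 1]]
      - [["filter_3", "==", 1]]
      - [["filter_4", "==", 1]]
```

YAML loads these as lists rather than tuples, which PyArrow generally accepts since each predicate still unpacks into exactly three values.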

2 comments

Is this actual code or a string literal filter?

If you need some exotic way to run Python code, e.g. to produce a Python datatype for Polars, you may want to check out resolvers on docs.kedro.org, where you can provide a custom expression
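The `too many values to unpack (expected 3)` error points at the predicate shape: each one must be exactly a `(column, op, value)` triple, with OR expressed by the outer list rather than a Python `or`. A stdlib-only sketch of how DNF filters of this shape are evaluated (illustrative only, not dask's or PyArrow's actual code):

```python
# Illustrative sketch: evaluating PyArrow-style DNF filters against one row.
# Outer list = OR of clauses; each clause = AND of (column, op, value) triples.
import operator

OPS = {"==": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}

def row_matches(row, filters):
    """Return True if the row satisfies the DNF filter expression."""
    return any(
        all(OPS[op](row[col], val) for col, op, val in clause)
        for clause in filters
    )

# OR of four single-predicate clauses -- the shape the catalog entry needs.
filters = [
    [("filter_1", "==", 1)],
    [("filter_2", "==", 1)],
    [("filter_3", "==", 1)],
    [("filter_4", "==", 1)],
]

row = {"filter_1": 0, "filter_2": 1, "filter_3": 0, "filter_4": 0}
print(row_matches(row, filters))  # prints True: filter_2 == 1
```

Writing `('a', '==', 1) or ('b', '==', 1)` in Python instead just evaluates to the first (truthy) tuple, which is why it silently "works" in the API but has a different meaning than intended.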
