Hey team, is there a way to use the pandas BigQuery dataset with parallel runners, or is the answer to use Ibis again?
Hey team, thank you for answering my previous questions. Another question, this time on setting up a dataset correctly at the beginning:
I have table schemas defined in my kedro catalog for BigQuery tables. I would like to create empty versions of these tables at the start of my kedro pipeline, based on these schemas.
How can I do this cleanly in kedro?
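One clean option is to do the creation in a hook before the run starts. Below is a minimal sketch, assuming the google-cloud-bigquery client library and a hypothetical `schemas` mapping (fully qualified table ID to column types) read from your own config; the helper names are made up for illustration:

```python
def to_api_repr(columns):
    """Turn {"user_id": "STRING"} into BigQuery's JSON schema representation."""
    return [{"name": name, "type": col_type} for name, col_type in columns.items()]


def create_empty_tables(schemas):
    """Create each table if it does not exist yet; leave existing tables alone.

    `schemas` maps a fully qualified table ID ("project.dataset.table")
    to a {column_name: BigQuery type} dict.
    """
    # Imported lazily so the pure helper above stays usable without GCP credentials.
    from google.cloud import bigquery

    client = bigquery.Client()
    for table_id, columns in schemas.items():
        schema = [bigquery.SchemaField.from_api_repr(f) for f in to_api_repr(columns)]
        # exists_ok=True makes this idempotent across pipeline runs.
        client.create_table(bigquery.Table(table_id, schema=schema), exists_ok=True)
```

You could call `create_empty_tables` from an `after_catalog_created` or `before_pipeline_run` hook registered in `settings.py` (that wiring is Kedro's standard hook mechanism); how you store the schemas alongside your catalog entries is up to you and is an assumption here.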
Hey team, what is a good way of checking whether all the input tables for the nodes I want to run are accessible? I am having issues with permissions in BigQuery, and testing is cumbersome. Is there a way to run a validation of all external datasets in the catalog?
I was thinking of adding a hook and a metadata tag that identifies the datasets as external.
My main concerns are
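The hook-plus-tag idea seems reasonable. A hedged sketch of the check itself, written against anything with a `DataCatalog`-style `exists(name)` method; how you collect the names tagged `external` from metadata is left out, since that depends on your Kedro version:

```python
def validate_external_datasets(catalog, external_names):
    """Return the subset of `external_names` that cannot be reached.

    `catalog` only needs an `exists(name)` method (as kedro's DataCatalog has).
    Exceptions (e.g. a BigQuery permission error) are treated like missing data.
    """
    unreachable = []
    for name in external_names:
        try:
            if not catalog.exists(name):
                unreachable.append(name)
        except Exception:  # permission denied, network error, ...
            unreachable.append(name)
    return unreachable
```

Calling this from a `before_pipeline_run` hook and raising if the returned list is non-empty would fail fast, before any node runs.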
Hi folks, I would like to pass the current date to my kedro pipelines at multiple steps. What is the best way to do this?
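One common pattern is to make the date a parameter, so every node that needs it declares `params:run_date` as an input, and a scheduler can override it per run. A sketch (the parameter name is an assumption, and the exact `--params` syntax varies by Kedro version):

```yaml
# conf/base/parameters.yml
run_date: "2024-01-01"  # override per run, e.g.: kedro run --params run_date=2024-06-30
```

Each node then takes it explicitly, e.g. `node(make_report, inputs=["orders", "params:run_date"], outputs="report")`, which keeps the date out of global state.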
Hey team, is there any way to pass async functions to nodes?
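Node functions have to be synchronous, but one workaround is a small adapter that drives the coroutine to completion with `asyncio.run`. A minimal sketch (the `fetch_double` coroutine is a made-up stand-in for real async I/O):

```python
import asyncio
import functools


def as_sync(async_fn):
    """Wrap a coroutine function so a node (which must be synchronous) can run it."""
    @functools.wraps(async_fn)
    def wrapper(*args, **kwargs):
        return asyncio.run(async_fn(*args, **kwargs))
    return wrapper


async def fetch_double(x):
    await asyncio.sleep(0)  # stand-in for real async work (HTTP call, DB query, ...)
    return x * 2


fetch_double_node = as_sync(fetch_double)  # pass this to node(...) instead of fetch_double
```

Inside the coroutine you can still fan out with `asyncio.gather`, so concurrency is preserved within the node even though the node itself blocks.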
When you have an expensive operation, is there a good way of loading from an existing dataset? I am trying to check whether a certain ID already exists and to perform the node's work only when it is new. If it is new, I add the new entries to the saved dataset so that next time I don't recalculate them. Effectively, caching results.
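One way to express that as a node: take the cached dataset as an input, compute only the missing IDs, and return the merged result to be saved. A minimal sketch with plain dicts (your real datasets are presumably DataFrames, but the shape of the logic is the same):

```python
def compute_with_cache(cache, ids, expensive_fn):
    """Run `expensive_fn` only for IDs not already in `cache`; return merged results."""
    missing = [i for i in ids if i not in cache]
    fresh = {i: expensive_fn(i) for i in missing}
    return {**cache, **fresh}
```

One caveat: Kedro rejects a node whose input and output are the same dataset (it looks like a cycle), so a common workaround is two catalog entries pointing at the same underlying file or table, one used as the node's input and one as its output.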
When using BigQuery datasets, how do you define a default dataset project-wide?
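Plain YAML anchors are one way to share defaults across catalog entries, since Kedro ignores top-level catalog keys starting with `_`. A sketch (project and dataset names hypothetical; on older kedro-datasets versions the class is spelled `pandas.GBQTableDataSet`):

```yaml
_bigquery_defaults: &bigquery_defaults
  type: pandas.GBQTableDataset
  project: my-gcp-project
  dataset: analytics

customers:
  <<: *bigquery_defaults
  table_name: customers

orders:
  <<: *bigquery_defaults
  table_name: orders
```

Changing the default dataset then means editing one anchor rather than every entry.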
Does anybody know why kedro viz might show only one node? I have 3 pipelines, but only one node from one of the pipelines is shown.
All my pipelines are summed into one default pipeline in the registry.
Hey everyone, I am trying to define the column dtypes of a CSV dataset, because some columns contain IDs that Kedro interprets as floats but that should be interpreted as strings instead. Setting:

load_args:
  dtype:
    user_id: str
save_args:
  dtype:
    user_id: str
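For what it's worth, only the `load_args` half should be needed: `dtype` is a `pandas.read_csv` argument, while `pandas.DataFrame.to_csv` accepts no `dtype`, so passing it in `save_args` would be rejected. A sketch of a full entry (path and names hypothetical; older kedro-datasets versions spell the class `pandas.CSVDataSet`):

```yaml
users:
  type: pandas.CSVDataset
  filepath: data/01_raw/users.csv
  load_args:
    dtype:
      user_id: str
```

On save the column is already a string in the DataFrame, so no save-side setting is required to round-trip it.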