Updated 3 months ago

Leveraging Delta Lake Functionality in Kedro Datasets

At a glance

A team is using the databricks.ManagedTableDataset dataset, which works with Databricks tables stored as Delta. They expected dataset.save to use the Delta merge functions (such as whenMatchedUpdate or whenNotMatchedInsert), but found in the source code that the merge is hardcoded in a SQL query instead. They ask whether there is a reason for this and whether they should propose a pull request (PR). The reply says a PR would be welcome, and the team plans to work on one, though it will take them some time.

Hello Kedro community,👋

We had a question in our team:
We are currently using the databricks.ManagedTableDataset dataset (which works with Databricks tables using Delta).

We thought that dataset.save would use Delta functions (e.g. whenMatchedUpdate or whenNotMatchedInsert), but that is not the case: in the source code, we saw that the merge is hardcoded in a SQL query.

Is there a reason for that? Should we propose a PR?

Thank you, and have a good weekend 🙂

2 comments

This is an evolving dataset, a PR would be welcome!

OK, that's what we thought. We'll do it (we'll need some time :))
