Error saving dataframe to snowpark table dataset

At a glance

The community member is trying to save a pandas DataFrame to a SnowparkTableDataset, but encounters an error saying the DataFrame object has no "write" attribute. The comments suggest converting the pandas DataFrame to a Snowpark DataFrame before saving it, using session.create_dataframe() and returning the resulting Snowpark DataFrame from the node. However, the community member is having trouble creating the Snowpark session, particularly with fetching the token from the Snowflake container service. There is no explicitly marked answer in the comments.

Hello Team,
I want to save a DataFrame back to a Snowpark Table dataset object, but I'm running into this error:

DatasetError: Failed while saving data to data set SnowparkTableDataset(...).
'DataFrame' object has no attribute 'write'
Code snippet in thread, please let me know if there is a way to do this πŸ˜„ Thanks so much!

6 comments

def push_output_to_snowflake(
    classified_text: pd.DataFrame,
    required_cols: list[str],
) -> pd.DataFrame:
    """ingests classified text and saves into Snowflake

    args:
        x
    return:
        classified_text: pd.DataFrame
    """

    # clean up the classified dataframe
    classified_text = process_dataframe(classified_text, required_cols)
    return classified_text

node(
    func=push_output_to_snowflake,
    inputs=[
        "combined_classification",
        "params:post_processing.message_output_cols",
    ],
    outputs="snowflake_message_classification_table",
)

In catalog.yml:
test_pipe.snowflake_message_classification_table:
  type: kedro_datasets.snowflake.SnowparkTableDataset
  table_name: x
  database: x
  schema: x
  credentials: snowflake_creds
  save_args:
    mode: append
    column_order: name
    table_type: ''

I see the push_output_to_snowflake function returns a pandas DataFrame. Can you instead convert it to a Snowpark DataFrame before saving it back to Snowflake?

Will try that.
So that would be using Session.builder.configs to create a session and then using write_pandas?

I was suggesting converting the pandas DataFrame to a Snowpark DataFrame inside your push_output_to_snowflake function, since SnowparkTableDataset loads and saves Snowpark DataFrames. The function would then return a df of type snowflake.snowpark.DataFrame. You can try using:
snowpark_df = session.create_dataframe(pandas_df, schema=schema)
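
For reference, a minimal sketch of what the node could look like if it returns a Snowpark DataFrame instead of a pandas one. The session argument and how it is wired in are assumptions for illustration, not part of the original code:

import pandas as pd
from snowflake.snowpark import Session
from snowflake.snowpark import DataFrame as SnowparkDataFrame


def push_output_to_snowflake(
    classified_text: pd.DataFrame,
    required_cols: list[str],
    session: Session,  # hypothetical: obtained elsewhere (e.g. created once at startup)
) -> SnowparkDataFrame:
    """Clean the classified text and return it as a Snowpark DataFrame."""
    classified_text = process_dataframe(classified_text, required_cols)
    # Convert to a Snowpark DataFrame so SnowparkTableDataset can call .write on it.
    return session.create_dataframe(classified_text)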

Any recommended way to create a session?
I'm trying to use Session.builder.configs, but that doesn't seem to work with the OmegaConf resolver and OAuth (as I need to fetch the token from the Snowflake container service).

Hi Akshata, Session.builder.configs(connection_params).create() should create the session. Could you tell us what is not working, or whether any errors are thrown? I did not fully understand if the issue is with getting the token or something else.

Also, I suppose your application should have already created a singleton session somewhere, unless this is the first time you are creating a session.
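
For completeness, a minimal sketch of building the session with Session.builder.configs, assuming the OAuth token is read from a file mounted by the Snowflake container service. The token path and the placeholder connection values are assumptions; adjust them to your environment:

from snowflake.snowpark import Session


def get_token(path: str = "/snowflake/session/token") -> str:
    # Assumed location of the OAuth token provided by the Snowflake container service.
    with open(path) as f:
        return f.read()


connection_params = {
    "account": "<account>",      # placeholders, as in the catalog entry above
    "database": "<database>",
    "schema": "<schema>",
    "warehouse": "<warehouse>",
    "authenticator": "oauth",
    "token": get_token(),
}
session = Session.builder.configs(connection_params).create()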
