Hi everyone
I have been exploring ibis for sometime. I just wanted to understand is there a better way to write the below code in a more optimised fashion
import ibis con = ibis.connect(POSTGRES_CONNECTION_STRING) training_meta_table:ir.Table = con.table("training_metadata") filters = { "customer_ids" : [59] , "queue_names" : ["General Lit - Misclassifications", "MoveDocs-MR"], "start_date" : "2024-09-5 00:00:00", "end_date" : "2024-09-11 00:00:00", "doc_types" : [], "fields" : ["patientFirstName", "patientLastName", "Service Date", "Doctor"] } field_conditions = training_meta_table.fields_present.contains(filters["fields"][0]) | training_meta_table.fields_present.contains(filters["fields"][1]) | training_meta_table.fields_present.contains(filters["fields"][2]) | training_meta_table.fields_present.contains(filters["fields"][3])
just referred it , i could not make much use of it.
if you see we need to kind of iterate through all field conditions(List) present in the filters dictionary and create ibis expression for each of them.
A bit of playing with chatgpt, I think this is the most readable:
import ibis from functools import reduce import operator # Establish connection con = ibis.connect(POSTGRES_CONNECTION_STRING) training_meta = con.table("training_metadata") # Define filters filters = { "customer_ids": [59], "queue_names": ["General Lit - Misclassifications", "MoveDocs-MR"], "start_date": "2024-09-05 00:00:00", "end_date": "2024-09-11 00:00:00", "fields": ["patientFirstName", "patientLastName", "Service Date", "Doctor"] } # Helper function to build filter conditions def build_filters(table, filters): conditions = [ table.customer_id.isin(filters["customer_ids"]), table.queue_name.isin(filters["queue_names"]), table.date.between(filters["start_date"], filters["end_date"]), table.fields_present.overlaps(filters["fields"]) ] return reduce(operator.and_, conditions) # Apply filters and select fields using the helper function result = ( training_meta .filter(build_filters(training_meta, filters)) .select(filters["fields"]) ) # Execute the query (uncomment when ready) # fetched_results = result.execute()
Thanks will test and let you know. But I am curious to know the use of overlaps()
here.
Just a bit of context here - The fields_present column in the table contains comma separated strings , attaching some examples as well here -
Document Date, Dr. Name, Doctor, Sender Free Text, Sender, Sender Logo, Sender Signature Document Type, patientDOB, patientFirstName, patientLastName, providerEntity, referringPhysician, serviceDate
training_meta_table.fields_present.contains(filters["fields"][0])
contains()
comes from ibis string expressions.