-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modify validity check + clean up creds #209
Conversation
translate_sql_dialect.py
Outdated
from tqdm import tqdm | ||
from eval.eval import get_all_minimal_queries | ||
import os | ||
|
||
tqdm.pandas() | ||
|
||
dataset_file = ( | ||
"data/instruct_advanced_postgres.csv" # Postgres dataset file to translate | ||
"data/instruct_advanced_postgresTEST.csv" # Postgres dataset file to translate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo spotted!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! i'll revert to the original
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Wendy! Looks good to me – thanks for the big changes in dialects.py
in particular, that was a very useful blind spot to look at!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just 1 small comment!
data/instruct_advanced_sqlite.csv
Outdated
@@ -54,7 +54,7 @@ Last 30 days = DATE('now', '-30 days') to DATE('now') | |||
PMSPS = per month salesperson signups | |||
To get the total sales amount per salesperson, join the salespersons and sales tables, group by salesperson, and sum the sale_price. Always order results with NULLS last. | |||
Truncate date to week for aggregation." | |||
car_dealership,sqlite,instructions_cte_join,"WITH recent_sales AS (SELECT sp.id, sp.first_name, sp.last_name, COUNT(s.id) AS num_sales FROM salespersons AS sp LEFT JOIN sales AS s ON sp.id = s.salesperson_id WHERE s.sale_date >= DATE('now', '-30 days') GROUP BY sp.id) SELECT id, first_name, last_name, num_sales FROM recent_sales ORDER BY num_sales DESC;WITH recent_sales AS (SELECT sp.id, sp.first_name, sp.last_name, COUNT(s.id) AS num_sales FROM salespersons AS sp LEFT JOIN sales AS s ON sp.id = s.salesperson_id AND s.sale_date >= DATE('now', '-30 days') GROUP BY sp.id, sp.first_name, sp.last_name) SELECT id, first_name, last_name, num_sales FROM recent_sales ORDER BY num_sales DESC;","How many sales did each salesperson make in the past 30 days, inclusive of today's date? Return their ID, first name, last name and number of sales made, ordered from most to least sales.","To get the number of sales made by each salesperson in the past 30 days, join the salespersons and sales tables and filter for sales in the last 30 days.","When using car makes, model names, engine_type, and vin_number, ensure matching is case-insensitive and allows for partial matches using LIKE with wildcards. | |||
car_dealership,sqlite,instructions_cte_join,"WITH recent_sales AS (SELECT sp.id, sp.first_name, sp.last_name, COUNT(s.id) AS num_sales FROM salespersons AS sp LEFT JOIN sales AS s ON sp.id = s.salesperson_id AND s.sale_date >= DATE('now', '-30 days') GROUP BY sp.id) SELECT id, first_name, last_name, num_sales FROM recent_sales ORDER BY num_sales DESC;","How many sales did each salesperson make in the past 30 days, inclusive of today's date? Return their ID, first name, last name and number of sales made, ordered from most to least sales.","To get the number of sales made by each salesperson in the past 30 days, join the salespersons and sales tables and filter for sales in the last 30 days.","When using car makes, model names, engine_type, and vin_number, ensure matching is case-insensitive and allows for partial matches using LIKE with wildcards. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it be GROUP BY sp.id, sp.first_name, sp.last_name)
instead of GROUP BY sp.id)
? Since we're selecting 3 columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh sorry i made a mistake. Shouldn't have removed the original because it provided an alternative where zero counts were not included.
On the grouping, I found just grouping by id
producing the exact same results so I thought the other grouping cols was redundant. Is there a fundamental difference in meaning that I'm missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answering my own question: Ohh it's cos of postgres' flexi rules that it still works with grouping only by id
. Should be better practice to add all selected cols in GROUP BY. Lemme revert to original!
dialects.py
to always refer to db_creds_all inutils.creds