Modify validity check + clean up creds (#209)
* modify creds

* clean up creds

* validity check on existing DBs

* remove redundant query alternative

* correct qn/sql to give non-empty result

* remove unused functions

* lint

* change dataset file name

* revert to original sql
wendy-aw authored Aug 13, 2024
1 parent bc283e5 commit d00ffdf
Showing 8 changed files with 259 additions and 132 deletions.
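The commit message mentions a validity check on existing DBs and a question/SQL pair corrected to give a non-empty result. The check itself is not visible in the hunks shown here, so the following is only a minimal sketch of what such a check might look like, using an in-memory SQLite database. The helper name `is_valid_and_non_empty`, the `doc_id` column, and the sample rows are hypothetical and not taken from the repository.

```python
import sqlite3


def is_valid_and_non_empty(conn, sql):
    """Return (ok, detail); ok is True only if sql runs and yields at least one row."""
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        # The query failed to parse or execute against this schema.
        return False, f"error: {exc}"
    if not rows:
        # The query is valid but tells us nothing when used as a gold answer.
        return False, "empty result"
    return True, f"{len(rows)} row(s)"


# Hypothetical stand-in for the derm_treatment doctors table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE doctors (doc_id INTEGER PRIMARY KEY, specialty TEXT);
INSERT INTO doctors (specialty) VALUES ('dermatology'), ('dermatology'), ('immunology');
""")

queries = {
    "good": "SELECT specialty, COUNT(*) AS num_doctors FROM doctors "
            "GROUP BY specialty ORDER BY num_doctors DESC LIMIT 2",
    "empty": "SELECT specialty FROM doctors WHERE specialty = 'oncology'",
    "broken": "SELECT missing_col FROM doctors",
}

results = {name: is_valid_and_non_empty(conn, sql) for name, sql in queries.items()}
for name, (ok, detail) in results.items():
    print(name, ok, detail)
```

A gold query that falls into the "empty" case is the kind of row this commit appears to adjust in each dialect file.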
2 changes: 1 addition & 1 deletion data/instruct_basic_bigquery.csv
@@ -29,7 +29,7 @@ derm_treatment,bigquery,basic_group_order_limit,"SELECT ins_type, AVG(height_cm)
derm_treatment,bigquery,basic_group_order_limit,"SELECT specialty, COUNT(*) AS num_doctors FROM derm_treatment.doctors GROUP BY specialty ORDER BY num_doctors DESC NULLS FIRST LIMIT 2;",What are the top 2 specialties by number of doctors? Return the specialty and number of doctors.
derm_treatment,bigquery,basic_left_join,"SELECT p.patient_id, p.first_name, p.last_name FROM derm_treatment.patients AS p LEFT JOIN derm_treatment.treatments AS t ON p.patient_id = t.patient_id WHERE t.patient_id IS NULL;","Return the patient IDs, first names and last names of patients who have not received any treatments."
derm_treatment,bigquery,basic_left_join,"SELECT d.drug_id, d.drug_name FROM derm_treatment.drugs AS d LEFT JOIN derm_treatment.treatments AS t ON d.drug_id = t.drug_id WHERE t.drug_id IS NULL;",Return the drug IDs and names of drugs that have not been used in any treatments.
-ewallet,bigquery,basic_join_date_group_order_limit,"SELECT m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM ewallet.merchants AS m JOIN ewallet.wallet_transactions_daily AS t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= CURRENT_DATE - INTERVAL '30' DAY GROUP BY merchant_name ORDER BY total_amount DESC NULLS FIRST LIMIT 5;","Who are the top 5 merchants (receiver type 1) by total transaction amount in the past 30 days (inclusive of 30 days ago)? Return the merchant name, total number of transactions, and total transaction amount."
+ewallet,bigquery,basic_join_date_group_order_limit,"SELECT m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM ewallet.merchants AS m JOIN ewallet.wallet_transactions_daily AS t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= CURRENT_DATE - INTERVAL '150' DAY GROUP BY merchant_name ORDER BY total_amount DESC NULLS FIRST LIMIT 2;","Who are the top 2 merchants (receiver type 1) by total transaction amount in the past 150 days (inclusive of 150 days ago)? Return the merchant name, total number of transactions, and total transaction amount."
ewallet,bigquery,basic_join_date_group_order_limit,"SELECT TIMESTAMP_TRUNC(t.created_at, MONTH) AS MONTH, COUNT(DISTINCT t.sender_id) AS active_users FROM ewallet.wallet_transactions_daily AS t JOIN ewallet.users AS u ON t.sender_id = u.uid WHERE t.sender_type = 0 AND t.status = 'success' AND u.status = 'active' AND t.created_at >= '2023-01-01' AND t.created_at < '2024-01-01' GROUP BY MONTH ORDER BY MONTH NULLS LAST;","How many distinct active users sent money per month in 2023? Return the number of active users per month (as a date), starting from the earliest date. Do not include merchants in the query. Only include successful transactions."
ewallet,bigquery,basic_join_group_order_limit,"SELECT c.code AS coupon_code, COUNT(t.txid) AS redemption_count, SUM(t.amount) AS total_discount FROM ewallet.coupons AS c JOIN ewallet.wallet_transactions_daily AS t ON c.cid = t.coupon_id GROUP BY coupon_code ORDER BY redemption_count DESC NULLS FIRST LIMIT 3;","What are the top 3 most frequently used coupon codes? Return the coupon code, total number of redemptions, and total amount redeemed."
ewallet,bigquery,basic_join_group_order_limit,"SELECT u.country, COUNT(DISTINCT t.sender_id) AS user_count, SUM(t.amount) AS total_amount FROM ewallet.users AS u JOIN ewallet.wallet_transactions_daily AS t ON u.uid = t.sender_id WHERE t.sender_type = 0 GROUP BY u.country ORDER BY total_amount DESC NULLS FIRST LIMIT 5;","Which are the top 5 countries by total transaction amount sent by users, sender_type = 0? Return the country, number of distinct users who sent, and total transaction amount."
2 changes: 1 addition & 1 deletion data/instruct_basic_mysql.csv
@@ -29,7 +29,7 @@ derm_treatment,mysql,basic_group_order_limit,"SELECT ins_type, AVG(height_cm) AS
derm_treatment,mysql,basic_group_order_limit,"SELECT specialty, COUNT(*) AS num_doctors FROM doctors GROUP BY specialty ORDER BY CASE WHEN num_doctors IS NULL THEN 1 ELSE 0 END DESC, num_doctors DESC LIMIT 2;",What are the top 2 specialties by number of doctors? Return the specialty and number of doctors.
derm_treatment,mysql,basic_left_join,"SELECT p.patient_id, p.first_name, p.last_name FROM patients AS p LEFT JOIN treatments AS t ON p.patient_id = t.patient_id WHERE t.patient_id IS NULL;","Return the patient IDs, first names and last names of patients who have not received any treatments."
derm_treatment,mysql,basic_left_join,"SELECT d.drug_id, d.drug_name FROM drugs AS d LEFT JOIN treatments AS t ON d.drug_id = t.drug_id WHERE t.drug_id IS NULL;",Return the drug IDs and names of drugs that have not been used in any treatments.
-ewallet,mysql,basic_join_date_group_order_limit,"SELECT m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM ewallet.merchants AS m JOIN ewallet.wallet_transactions_daily AS t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= CURRENT_DATE - INTERVAL 30 DAY GROUP BY m.name ORDER BY total_amount DESC LIMIT 5;","Who are the top 5 merchants (receiver type 1) by total transaction amount in the past 30 days (inclusive of 30 days ago)? Return the merchant name, total number of transactions, and total transaction amount."
+ewallet,mysql,basic_join_date_group_order_limit,"SELECT m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM ewallet.merchants AS m JOIN ewallet.wallet_transactions_daily AS t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= CURRENT_DATE - INTERVAL 150 DAY GROUP BY m.name ORDER BY total_amount DESC LIMIT 2;","Who are the top 2 merchants (receiver type 1) by total transaction amount in the past 150 days (inclusive of 150 days ago)? Return the merchant name, total number of transactions, and total transaction amount."
ewallet,mysql,basic_join_date_group_order_limit,"SELECT DATE_FORMAT(t.created_at, '%Y-%m-01') AS month, COUNT(DISTINCT t.sender_id) AS active_users FROM ewallet.wallet_transactions_daily AS t JOIN ewallet.users AS u ON t.sender_id = u.uid WHERE t.sender_type = 0 AND t.status = 'success' AND u.status = 'active' AND t.created_at >= '2023-01-01' AND t.created_at < '2024-01-01' GROUP BY month ORDER BY month;","How many distinct active users sent money per month in 2023? Return the number of active users per month (as a date), starting from the earliest date. Do not include merchants in the query. Only include successful transactions."
ewallet,mysql,basic_join_group_order_limit,"SELECT c.code AS coupon_code, COUNT(t.txid) AS redemption_count, SUM(t.amount) AS total_discount FROM coupons AS c JOIN wallet_transactions_daily AS t ON c.cid = t.coupon_id GROUP BY c.code ORDER BY CASE WHEN redemption_count IS NULL THEN 1 ELSE 0 END DESC, redemption_count DESC LIMIT 3;","What are the top 3 most frequently used coupon codes? Return the coupon code, total number of redemptions, and total amount redeemed."
ewallet,mysql,basic_join_group_order_limit,"SELECT u.country, COUNT(DISTINCT t.sender_id) AS user_count, SUM(t.amount) AS total_amount FROM users AS u JOIN wallet_transactions_daily AS t ON u.uid = t.sender_id WHERE t.sender_type = 0 GROUP BY u.country ORDER BY CASE WHEN total_amount IS NULL THEN 1 ELSE 0 END DESC, total_amount DESC LIMIT 5;","Which are the top 5 countries by total transaction amount sent by users, sender_type = 0? Return the country, number of distinct users who sent, and total transaction amount."
2 changes: 1 addition & 1 deletion data/instruct_basic_postgres.csv
@@ -29,7 +29,7 @@ derm_treatment,basic_group_order_limit,"What are the top 3 insurance types by av
derm_treatment,basic_group_order_limit,What are the top 2 specialties by number of doctors? Return the specialty and number of doctors.,"SELECT specialty, COUNT(*) AS num_doctors FROM doctors GROUP BY specialty ORDER BY num_doctors DESC LIMIT 2"
derm_treatment,basic_left_join,"Return the patient IDs, first names and last names of patients who have not received any treatments.","SELECT p.patient_id, p.first_name, p.last_name FROM patients p LEFT JOIN treatments t ON p.patient_id = t.patient_id WHERE t.patient_id IS NULL"
derm_treatment,basic_left_join,Return the drug IDs and names of drugs that have not been used in any treatments.,"SELECT d.drug_id, d.drug_name FROM drugs d LEFT JOIN treatments t ON d.drug_id = t.drug_id WHERE t.drug_id IS NULL"
-ewallet,basic_join_date_group_order_limit,"Who are the top 5 merchants (receiver type 1) by total transaction amount in the past 30 days (inclusive of 30 days ago)? Return the merchant name, total number of transactions, and total transaction amount.","SELECT m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM consumer_div.merchants m JOIN consumer_div.wallet_transactions_daily t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= CURRENT_DATE - INTERVAL '30 days' GROUP BY m.name ORDER BY total_amount DESC LIMIT 5"
+ewallet,basic_join_date_group_order_limit,"Who are the top 2 merchants (receiver type 1) by total transaction amount in the past 150 days (inclusive of 150 days ago)? Return the merchant name, total number of transactions, and total transaction amount.","SELECT m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM consumer_div.merchants m JOIN consumer_div.wallet_transactions_daily t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= CURRENT_DATE - INTERVAL '150 days' GROUP BY m.name ORDER BY total_amount DESC LIMIT 2"
ewallet,basic_join_date_group_order_limit,"How many distinct active users sent money per month in 2023? Return the number of active users per month (as a date), starting from the earliest date. Do not include merchants in the query. Only include successful transactions.","SELECT DATE_TRUNC('month', t.created_at) AS MONTH, COUNT(DISTINCT t.sender_id) AS active_users FROM consumer_div.wallet_transactions_daily t JOIN consumer_div.users u ON t.sender_id = u.uid WHERE t.sender_type = 0 AND t.status = 'success' AND u.status = 'active' AND t.created_at >= '2023-01-01' AND t.created_at < '2024-01-01' GROUP BY MONTH ORDER BY MONTH"
ewallet,basic_join_group_order_limit,"What are the top 3 most frequently used coupon codes? Return the coupon code, total number of redemptions, and total amount redeemed.","SELECT c.code AS coupon_code, COUNT(t.txid) AS redemption_count, SUM(t.amount) AS total_discount FROM consumer_div.coupons c JOIN consumer_div.wallet_transactions_daily t ON c.cid = t.coupon_id GROUP BY c.code ORDER BY redemption_count DESC LIMIT 3"
ewallet,basic_join_group_order_limit,"Which are the top 5 countries by total transaction amount sent by users, sender_type = 0? Return the country, number of distinct users who sent, and total transaction amount.","SELECT u.country, COUNT(DISTINCT t.sender_id) AS user_count, SUM(t.amount) AS total_amount FROM consumer_div.users u JOIN consumer_div.wallet_transactions_daily t ON u.uid = t.sender_id WHERE t.sender_type = 0 GROUP BY u.country ORDER BY total_amount DESC LIMIT 5"
2 changes: 1 addition & 1 deletion data/instruct_basic_sqlite.csv
@@ -29,7 +29,7 @@ derm_treatment,sqlite,basic_group_order_limit,"SELECT ins_type, AVG(height_cm) A
derm_treatment,sqlite,basic_group_order_limit,"SELECT specialty, COUNT(*) AS num_doctors FROM doctors GROUP BY specialty ORDER BY CASE WHEN num_doctors IS NULL THEN 1 ELSE 0 END DESC, num_doctors DESC LIMIT 2;",What are the top 2 specialties by number of doctors? Return the specialty and number of doctors.
derm_treatment,sqlite,basic_left_join,"SELECT p.patient_id, p.first_name, p.last_name FROM patients AS p LEFT JOIN treatments AS t ON p.patient_id = t.patient_id WHERE t.patient_id IS NULL;","Return the patient IDs, first names and last names of patients who have not received any treatments."
derm_treatment,sqlite,basic_left_join,"SELECT d.drug_id, d.drug_name FROM drugs AS d LEFT JOIN treatments AS t ON d.drug_id = t.drug_id WHERE t.drug_id IS NULL;",Return the drug IDs and names of drugs that have not been used in any treatments.
-ewallet,sqlite,basic_join_date_group_order_limit,"SELECT m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM merchants AS m JOIN wallet_transactions_daily AS t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= DATE('now', '-30 days') GROUP BY m.name ORDER BY total_amount DESC LIMIT 5;","Who are the top 5 merchants (receiver type 1) by total transaction amount in the past 30 days (inclusive of 30 days ago)? Return the merchant name, total number of transactions, and total transaction amount."
+ewallet,sqlite,basic_join_date_group_order_limit,"SELECT m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM merchants AS m JOIN wallet_transactions_daily AS t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= DATE('now', '-150 days') GROUP BY m.name ORDER BY total_amount DESC LIMIT 2;","Who are the top 2 merchants (receiver type 1) by total transaction amount in the past 150 days (inclusive of 150 days ago)? Return the merchant name, total number of transactions, and total transaction amount."
ewallet,sqlite,basic_join_date_group_order_limit,"SELECT strftime('%Y-%m', t.created_at) AS month, COUNT(DISTINCT t.sender_id) AS active_users FROM wallet_transactions_daily AS t JOIN users AS u ON t.sender_id = u.uid WHERE t.sender_type = 0 AND t.status = 'success' AND u.status = 'active' AND t.created_at >= '2023-01-01' AND t.created_at < '2024-01-01' GROUP BY month ORDER BY month;","How many distinct active users sent money per month in 2023? Return the number of active users per month (as a date), starting from the earliest date. Do not include merchants in the query. Only include successful transactions."
ewallet,sqlite,basic_join_group_order_limit,"SELECT c.code AS coupon_code, COUNT(t.txid) AS redemption_count, SUM(t.amount) AS total_discount FROM coupons AS c JOIN wallet_transactions_daily AS t ON c.cid = t.coupon_id GROUP BY c.code ORDER BY CASE WHEN redemption_count IS NULL THEN 1 ELSE 0 END DESC, redemption_count DESC LIMIT 3;","What are the top 3 most frequently used coupon codes? Return the coupon code, total number of redemptions, and total amount redeemed."
ewallet,sqlite,basic_join_group_order_limit,"SELECT u.country, COUNT(DISTINCT t.sender_id) AS user_count, SUM(t.amount) AS total_amount FROM users AS u JOIN wallet_transactions_daily AS t ON u.uid = t.sender_id WHERE t.sender_type = 0 GROUP BY u.country ORDER BY CASE WHEN total_amount IS NULL THEN 1 ELSE 0 END DESC, total_amount DESC LIMIT 5;","Which are the top 5 countries by total transaction amount sent by users, sender_type = 0? Return the country, number of distinct users who sent, and total transaction amount."
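The two merchant rows in this hunk differ only in the look-back window (30 vs 150 days) and the limit (5 vs 2). As a rough illustration of why a wider window can turn an empty result into a non-empty one, the sketch below runs the same `created_at >= DATE('now', ...)` filter against an in-memory SQLite table. The sample rows, the simplified three-column schema, and the helper `count_since` are assumptions made for this example, and storing `created_at` as a `YYYY-MM-DD` string is also an assumption about the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical version of wallet_transactions_daily with
# transactions dated 40, 150 and 200 days before today (UTC).
conn.executescript("""
CREATE TABLE wallet_transactions_daily (
    txid INTEGER PRIMARY KEY,
    amount REAL,
    created_at TEXT
);
INSERT INTO wallet_transactions_daily (amount, created_at) VALUES
    (10.0, DATE('now', '-40 days')),
    (20.0, DATE('now', '-150 days')),
    (30.0, DATE('now', '-200 days'));
""")


def count_since(days):
    """Count rows whose created_at is on or after the date `days` days ago."""
    sql = (
        "SELECT COUNT(*) FROM wallet_transactions_daily "
        "WHERE created_at >= DATE('now', ?)"
    )
    return conn.execute(sql, (f"-{days} days",)).fetchone()[0]


n30 = count_since(30)    # no sample row is recent enough
n150 = count_since(150)  # includes the row dated exactly 150 days ago
print(n30, n150)
```

Because the comparison uses `>=`, the row dated exactly 150 days ago is counted, which matches the "inclusive of 150 days ago" wording in the question.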
2 changes: 1 addition & 1 deletion data/instruct_basic_tsql.csv
@@ -29,7 +29,7 @@ derm_treatment,tsql,basic_group_order_limit,"SELECT TOP 3 ins_type, AVG(height_c
derm_treatment,tsql,basic_group_order_limit,"SELECT TOP 2 specialty, COUNT(*) AS num_doctors FROM doctors GROUP BY specialty ORDER BY COUNT(*) DESC;",What are the top 2 specialties by number of doctors? Return the specialty and number of doctors.
derm_treatment,tsql,basic_left_join,"SELECT p.patient_id, p.first_name, p.last_name FROM patients AS p LEFT JOIN treatments AS t ON p.patient_id = t.patient_id WHERE t.patient_id IS NULL;","Return the patient IDs, first names and last names of patients who have not received any treatments."
derm_treatment,tsql,basic_left_join,"SELECT d.drug_id, d.drug_name FROM drugs AS d LEFT JOIN treatments AS t ON d.drug_id = t.drug_id WHERE t.drug_id IS NULL;",Return the drug IDs and names of drugs that have not been used in any treatments.
-ewallet,tsql,basic_join_date_group_order_limit,"SELECT TOP 5 m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM consumer_div.merchants AS m JOIN consumer_div.wallet_transactions_daily AS t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= DATEADD(DAY, -30, GETDATE()) GROUP BY m.name ORDER BY SUM(t.amount) DESC;","Who are the top 5 merchants (receiver type 1) by total transaction amount in the past 30 days (inclusive of 30 days ago)? Return the merchant name, total number of transactions, and total transaction amount."
+ewallet,tsql,basic_join_date_group_order_limit,"SELECT TOP 2 m.name AS merchant_name, COUNT(t.txid) AS total_transactions, SUM(t.amount) AS total_amount FROM consumer_div.merchants AS m JOIN consumer_div.wallet_transactions_daily AS t ON m.mid = t.receiver_id WHERE t.receiver_type = 1 AND t.created_at >= DATEADD(DAY, -150, GETDATE()) GROUP BY m.name ORDER BY SUM(t.amount) DESC;","Who are the top 2 merchants (receiver type 1) by total transaction amount in the past 150 days (inclusive of 150 days ago)? Return the merchant name, total number of transactions, and total transaction amount."
ewallet,tsql,basic_join_date_group_order_limit,"SELECT DATEFROMPARTS(YEAR(t.created_at), MONTH(t.created_at), 1) AS month, COUNT(DISTINCT t.sender_id) AS active_users FROM consumer_div.wallet_transactions_daily AS t JOIN consumer_div.users AS u ON t.sender_id = u.uid WHERE t.sender_type = 0 AND t.status = 'success' AND u.status = 'active' AND t.created_at >= '2023-01-01' AND t.created_at < '2024-01-01' GROUP BY DATEFROMPARTS(YEAR(t.created_at), MONTH(t.created_at), 1) ORDER BY month;","How many distinct active users sent money per month in 2023? Return the number of active users per month (as a date), starting from the earliest date. Do not include merchants in the query. Only include successful transactions."
ewallet,tsql,basic_join_group_order_limit,"SELECT TOP 3 c.code AS coupon_code, COUNT(t.txid) AS redemption_count, SUM(t.amount) AS total_discount FROM consumer_div.coupons AS c JOIN consumer_div.wallet_transactions_daily AS t ON c.cid = t.coupon_id GROUP BY c.code ORDER BY COUNT(t.txid) DESC;","What are the top 3 most frequently used coupon codes? Return the coupon code, total number of redemptions, and total amount redeemed."
ewallet,tsql,basic_join_group_order_limit,"SELECT TOP 5 u.country, COUNT(DISTINCT t.sender_id) AS user_count, SUM(t.amount) AS total_amount FROM consumer_div.users AS u JOIN consumer_div.wallet_transactions_daily AS t ON u.uid = t.sender_id WHERE t.sender_type = 0 GROUP BY u.country ORDER BY SUM(t.amount) DESC;","Which are the top 5 countries by total transaction amount sent by users, sender_type = 0? Return the country, number of distinct users who sent, and total transaction amount."