Skip to content

Commit

Permalink
[batch] reduce CPU use of delete_prev_..._resource_records (#14366)
Browse files Browse the repository at this point in the history
I assessed three query options, but keep in mind that the database is
currently at nearly 100% CPU so these timings are high variance. I'm
honestly pretty surprised that the first one isn't the fastest. It seems
to me like the obvious SQL query and yet it does not perform as well as
those using lateral joins.

Option 2 and Option 3 both use lateral joins. Option 2 uses a LEFT JOIN
followed by a `WHERE ... IS NOT NULL` which is definitionally an `INNER
JOIN`. For that reason, I also explored Option 3, directly using `INNER
JOIN`, which appears to be very slightly faster and lower variance.

| Option | Timing (mean + stddev, in seconds) |
|--------|------------------------------------|
| 1      | 9.92 +- 2.02                       |
| 2      | 5.35 +- 1.87                       |
| 3      | 5.11 +- 1.15                       |

## Option 1
```
SELECT DISTINCT
  group_resources.batch_id,
  group_resources.update_id,
  group_resources.job_group_id
FROM job_group_inst_coll_cancellable_resources AS group_resources
INNER JOIN job_group_self_and_ancestors AS descendant
   ON descendant.batch_id     = group_resources.batch_id
  AND descendant.job_group_id = group_resources.job_group_id
INNER JOIN job_groups_cancelled AS cancelled
   ON descendant.batch_id     = cancelled.id
  AND descendant.ancestor_id  = cancelled.job_group_id
ORDER BY group_resources.batch_id desc, group_resources.update_id desc, group_resources.job_group_id desc
LIMIT 1000;
```
```
1000 rows in set (11.81 sec)
1000 rows in set (8.91 sec)
1000 rows in set (11.86 sec)
1000 rows in set (7.08 sec)
```
## Option 2
```
SELECT DISTINCT
  group_resources.batch_id,
  group_resources.update_id,
  group_resources.job_group_id
FROM job_group_inst_coll_cancellable_resources AS group_resources
LEFT JOIN LATERAL (
  SELECT
    1 AS cancelled
  FROM job_group_self_and_ancestors AS descendant
  INNER JOIN job_groups_cancelled AS cancelled
     ON descendant.batch_id     = cancelled.id
    AND descendant.ancestor_id  = cancelled.job_group_id
  WHERE descendant.batch_id     = group_resources.batch_id
    AND descendant.job_group_id = group_resources.job_group_id
) AS t ON TRUE
WHERE t.cancelled IS NOT NULL
ORDER BY group_resources.batch_id desc, group_resources.update_id desc, group_resources.job_group_id desc
LIMIT 1000;
```
```
1000 rows in set (3.68 sec)
1000 rows in set (5.24 sec)
1000 rows in set (8.43 sec)
1000 rows in set (4.03 sec)
```
## Option 3
```
SELECT DISTINCT
  group_resources.batch_id,
  group_resources.update_id,
  group_resources.job_group_id
FROM job_group_inst_coll_cancellable_resources AS group_resources
INNER JOIN LATERAL (
  SELECT
    1
  FROM job_group_self_and_ancestors AS descendant
  INNER JOIN job_groups_cancelled AS cancelled
     ON descendant.batch_id = cancelled.id
    AND descendant.ancestor_id = cancelled.job_group_id
  WHERE descendant.batch_id = group_resources.batch_id
    AND descendant.job_group_id = group_resources.job_group_id
) AS t ON TRUE
ORDER BY group_resources.batch_id desc, group_resources.update_id desc, group_resources.job_group_id desc
LIMIT 1000;
```
```
1000 rows in set (6.74 sec)
1000 rows in set (5.63 sec)
1000 rows in set (3.86 sec)
1000 rows in set (4.20 sec)
```
  • Loading branch information
danking authored Feb 27, 2024
1 parent 5900dd6 commit a325690
Showing 1 changed file with 15 additions and 14 deletions.
29 changes: 15 additions & 14 deletions batch/batch/driver/main.py
Original file line number Diff line number Diff line change
Expand Up @@ -1461,21 +1461,22 @@ async def delete_committed_job_groups_inst_coll_staging_records(db: Database):
async def delete_prev_cancelled_job_group_cancellable_resources_records(db: Database):
targets = db.execute_and_fetchall(
"""
SELECT job_group_inst_coll_cancellable_resources.batch_id,
job_group_inst_coll_cancellable_resources.update_id,
job_group_inst_coll_cancellable_resources.job_group_id
FROM job_group_inst_coll_cancellable_resources
LEFT JOIN LATERAL (
SELECT 1 AS cancelled
FROM job_group_self_and_ancestors
INNER JOIN job_groups_cancelled
ON job_group_self_and_ancestors.batch_id = job_groups_cancelled.id AND
job_group_self_and_ancestors.ancestor_id = job_groups_cancelled.job_group_id
WHERE job_group_inst_coll_cancellable_resources.batch_id = job_group_self_and_ancestors.batch_id AND
job_group_inst_coll_cancellable_resources.job_group_id = job_group_self_and_ancestors.job_group_id
SELECT DISTINCT
group_resources.batch_id,
group_resources.update_id,
group_resources.job_group_id
FROM job_group_inst_coll_cancellable_resources AS group_resources
INNER JOIN LATERAL (
SELECT
1
FROM job_group_self_and_ancestors AS descendant
INNER JOIN job_groups_cancelled AS cancelled
ON descendant.batch_id = cancelled.id
AND descendant.ancestor_id = cancelled.job_group_id
WHERE descendant.batch_id = group_resources.batch_id
AND descendant.job_group_id = group_resources.job_group_id
) AS t ON TRUE
WHERE t.cancelled IS NOT NULL
GROUP BY job_group_inst_coll_cancellable_resources.batch_id, job_group_inst_coll_cancellable_resources.update_id, job_group_inst_coll_cancellable_resources.job_group_id
ORDER BY group_resources.batch_id desc, group_resources.update_id desc, group_resources.job_group_id desc
LIMIT 1000;
""",
query_name='find_cancelled_cancellable_resources_records_to_delete',
Expand Down

0 comments on commit a325690

Please sign in to comment.