[db] Request database optimizations #7602
Closed
This PR introduces a series of changes to make accesses to the request DB more efficient.
General changes
- The database is given an index on `created_at`. Several queries in `requests.py` have `ORDER BY created_at` clauses, which an index can accelerate.
- `RequestTaskFilter` now has a `sort` parameter that toggles the inclusion of the `ORDER BY created_at` clause. Callers that do not need sorted results can skip the sort entirely.
- `RequestTaskFilter` had a `fields` parameter introduced 2 days ago. This PR passes an appropriate `fields` value from callers that do not need all of the fields (especially request / response bodies) to perform their tasks.
- `get_request_tasks_with_fields_async` is merged into `get_request_tasks_async`, so the latter handles an optional `fields` parameter when one is provided. Similarly, `get_request_tasks` is modified to handle an optional `fields` parameter.
- `exact_match` parameters are added to some queries that act on a request given a `request_id`. Since the API server wants to handle clients submitting a request ID prefix, query functions use a `WHERE request_id LIKE <prefix>%` clause to handle prefixes. However, in cases where we know an exact request ID is supplied, using `WHERE request_id = <id>` is more efficient.

Case studies of specific codepaths
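Both codepaths below rely on the same few query shapes, so a minimal sketch may help. It assumes a SQLite-style `requests` table with hypothetical columns (`request_id`, `status`, `request_body`, `created_at`), and `build_query` is an illustrative helper, not the actual `RequestTaskFilter` implementation. It shows how `fields`, `sort`, and `exact_match` each change the generated SQL.

```python
import sqlite3
from typing import List, Optional, Tuple

# Hypothetical column set; the real schema in requests.py may differ.
ALLOWED_COLUMNS = {'request_id', 'status', 'request_body', 'created_at'}


def build_query(fields: Optional[List[str]] = None,
                request_id: Optional[str] = None,
                exact_match: bool = False,
                sort: bool = True) -> Tuple[str, list]:
    """Illustrative helper: builds a SELECT against a `requests` table."""
    if fields:
        unknown = set(fields) - ALLOWED_COLUMNS
        if unknown:
            raise ValueError(f'Unknown columns: {sorted(unknown)}')
        columns = ', '.join(fields)
    else:
        columns = '*'
    sql = f'SELECT {columns} FROM requests'
    params: list = []
    if request_id is not None:
        if exact_match:
            # Equality lookup: can use the primary key index directly.
            sql += ' WHERE request_id = ?'
            params.append(request_id)
        else:
            # Prefix lookup: supports clients that send a short ID.
            sql += ' WHERE request_id LIKE ?'
            params.append(request_id + '%')
    if sort:
        sql += ' ORDER BY created_at'
    return sql, params


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE requests ('
             'request_id TEXT PRIMARY KEY, status TEXT, '
             'request_body TEXT, created_at REAL)')
# Index that lets ORDER BY created_at avoid a separate sort step.
conn.execute('CREATE INDEX idx_requests_created_at ON requests (created_at)')
conn.executemany('INSERT INTO requests VALUES (?, ?, ?, ?)', [
    ('abc123', 'RUNNING', 'large body', 2.0),
    ('abd456', 'PENDING', 'large body', 1.0),
])

# Only IDs, no sorting: the shape a "cancel all" path would want.
sql, params = build_query(fields=['request_id'], sort=False)
all_ids = [row[0] for row in conn.execute(sql, params)]

# Prefix match, sorted by creation time.
sql, params = build_query(fields=['request_id'], request_id='ab')
prefix_ids = [row[0] for row in conn.execute(sql, params)]

# Exact match on a full ID, fetching a single small column.
sql, params = build_query(fields=['status'], request_id='abc123',
                          exact_match=True, sort=False)
exact_status = conn.execute(sql, params).fetchall()
```

Column names are interpolated into the SQL string, so the sketch validates them against a fixed set; values always go through bound parameters.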
- `sky api cancel -a`:
  - `kill_requests`. Since no request IDs are specified, the request IDs are retrieved from the DB. This DB call now returns only request IDs (instead of whole requests) and does not sort. I expect this to be the bulk of the efficiency improvement.
  - `exact_match` is set to `True` on `update_request`. This uses an exact match query (`WHERE request_id = <id>` instead of `WHERE request_id LIKE <prefix>%`), making the operation more efficient.
- `sky logs`:
  - `_tail_log_file`. We now establish an `exact_request_id` at the start of `_tail_log_file` and use an exact match query, making the operation more efficient.

Tested (run the relevant ones):
- `bash format.sh`
- `/smoke-test` (CI) or `pytest tests/test_smoke.py` (local)
- `/smoke-test -k test_name` (CI) or `pytest tests/test_smoke.py::test_name` (local)
- `/quicktest-core` (CI) or `pytest tests/smoke_tests/test_backward_compat.py` (local)
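Beyond the checks above, the efficiency claims can be sanity-checked locally by inspecting query plans. The sketch below assumes a SQLite backend and a hypothetical `requests` schema and index name; `query_plan` is an illustrative helper, not part of this PR. The exact plan text varies between SQLite versions, so treat the output as a rough guide.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params=()) -> str:
    """Returns the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = conn.execute('EXPLAIN QUERY PLAN ' + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return ' | '.join(str(row[-1]) for row in rows)


conn = sqlite3.connect(':memory:')
# Hypothetical schema; the real table in requests.py may differ.
conn.execute('CREATE TABLE requests ('
             'request_id TEXT PRIMARY KEY, status TEXT, '
             'request_body TEXT, created_at REAL)')
conn.execute('CREATE INDEX idx_requests_created_at ON requests (created_at)')

# Exact match: typically a SEARCH using the primary key index.
exact_plan = query_plan(
    conn, 'SELECT status FROM requests WHERE request_id = ?', ('abc123',))

# Prefix match: with SQLite's default case-insensitive LIKE on a
# BINARY-collated column, this typically falls back to a full SCAN.
prefix_plan = query_plan(
    conn, 'SELECT status FROM requests WHERE request_id LIKE ?', ('abc%',))

# Sorted vs. unsorted retrieval of request IDs.
sorted_plan = query_plan(
    conn, 'SELECT request_id FROM requests ORDER BY created_at')
unsorted_plan = query_plan(conn, 'SELECT request_id FROM requests')

for name, plan in [('exact', exact_plan), ('prefix', prefix_plan),
                   ('sorted', sorted_plan), ('unsorted', unsorted_plan)]:
    print(f'{name}: {plan}')
```

If the sorted plan mentions a temporary B-tree for `ORDER BY`, the index on `created_at` is not being used for that query.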