feat: Pre-download bundle uploads before lock to reduce contention #678
base: main
Conversation
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #678 +/- ##
=======================================
Coverage 92.43% 92.43%
=======================================
Files 1297 1297
Lines 47730 47789 +59
Branches 1606 1606
=======================================
+ Hits 44119 44175 +56
- Misses 3302 3305 +3
Partials 309 309
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
drazisil-codecov left a comment
AI Code Review
Summary
This PR reduces lock contention by pre-downloading bundle upload files before acquiring the Redis lock. The implementation is well-designed with proper cleanup handling and error cases.
Status: APPROVED with minor suggestions
Issues Found
Issue 1: Misleading comment (Low Priority)
File: apps/worker/services/bundle_analysis/report.py
Function: extract_bundle_name_from_file
Problem: Comment states "Stop after reading the first ~100 events" but no limit is implemented. The loop continues until bundleName is found or EOF.
Suggested Fix:
# Option A: Implement the limit
for count, (prefix, event, value) in enumerate(ijson.parse(f)):
    if prefix == "bundleName":
        return value
    if count > 100:
        break

# Option B: Fix the comment
# Returns early when bundleName is found; returns None if not present

Action Required: FIX_COMMENT_OR_IMPLEMENT_LIMIT
Issue 2: Unused code added (Low Priority, Intentional)
Files: apps/worker/services/lock_manager.py, apps/worker/services/bundle_analysis/report.py
Problem: lock_key_suffix parameter and extract_bundle_name_from_file helper are added but not used in this PR.
Note: PR description indicates this is intentional groundwork for future per-bundle parallelism.
Action Required: NONE (defer to author discretion)
Verification Checklist
- Cleanup handling is correct (`finally` blocks properly clean up temp files)
- Race conditions addressed (each worker creates its own temp file copy)
- Fallback behavior works when the pre-downloaded file is missing
- Error handling in `_pre_download_upload_file` is comprehensive
- Test coverage is adequate
For AI Agents
{
"review_result": "APPROVED_WITH_SUGGESTIONS",
"blocking_issues": [],
"non_blocking_issues": [
{
"id": "misleading-comment",
"severity": "low",
"file": "apps/worker/services/bundle_analysis/report.py",
"function": "extract_bundle_name_from_file",
"action": "Either implement 100-event limit or update comment to reflect actual behavior",
"lines": "24-25"
}
],
"approved": true
}
We might be able to take advantage of creating a class that handles this file creation and removal.
that way, we can use the pattern (please think of a better class name)
class BundleAnalysisFile:
    def __enter__(self):
        # do all the downloading stuff here
        ...

    def __exit__(self, exc_type, exc_val, exc_tb):
        # do all the temp file deletion here
        ...

with BundleAnalysisFile() as ba:
    with Lockwhatever:  # do the lock stuff
        # do stuff
        ...

Using a contextlib is also a good solution if we don't want to make a class.
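For comparison, a minimal sketch of the contextlib variant, assuming a hypothetical download helper that returns the temp-file path; the names here (pre_downloaded_upload_file, download_to_temp) are illustrative, not the PR's actual API:

import contextlib
import os


@contextlib.contextmanager
def pre_downloaded_upload_file(archive_service, storage_path):
    """Download the upload payload to a temp file and delete it on exit."""
    local_path = archive_service.download_to_temp(storage_path)  # assumed helper
    try:
        yield local_path
    finally:
        # Cleanup runs even if lock acquisition or processing raises.
        if local_path and os.path.exists(local_path):
            os.remove(local_path)

It would be used the same way as the class version: wrap the lock acquisition and processing inside the with block, so the temp file outlives the locked section but never the task.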
| "Failed to pre-download upload file", | ||
| extra={"repoid": repoid, "upload_id": upload_id, "error": str(e)}, | ||
| ) | ||
| if os.path.exists(local_path): |
perhaps this should be in finally
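For illustration, a sketch of what hoisting the cleanup into finally could look like; the function body below is hypothetical (only the log call mirrors the snippet above), and the download helper is a made-up stand-in for the real GCS call:

import logging
import os

log = logging.getLogger(__name__)


def _pre_download_upload_file(repoid, upload_id, download_to_temp):
    local_path = None
    succeeded = False
    try:
        local_path = download_to_temp(upload_id)  # assumed download helper
        succeeded = True
        return local_path
    except Exception as e:
        log.warning(
            "Failed to pre-download upload file",
            extra={"repoid": repoid, "upload_id": upload_id, "error": str(e)},
        )
        return None
    finally:
        # Clean up only when the download did not complete; on success the
        # caller takes ownership of the temp file and removes it later.
        if not succeeded and local_path and os.path.exists(local_path):
            os.remove(local_path)

The caveat with a plain finally is that it also runs on success, so the cleanup has to stay conditional unless ownership of the temp file moves elsewhere (for example, into the context-manager approach suggested above).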
    )
    if os.path.exists(local_path):
        os.remove(local_path)
    return None
can we just return None outside this loop?
Cursor Bugbot has reviewed your changes and found 1 potential issue.
    # Note: We still use per-commit locking because all bundles share the same
    # SQLite report file, which requires serialized access to prevent data loss.
    pre_downloaded_path = self._pre_download_upload_file(db_session, repoid, params)
Pre-downloaded file leaks if LockManager construction fails
Medium Severity
The pre-downloaded file cleanup in the finally block (lines 172-175) won't run if LockManager() construction fails at lines 129-133. Since the try block starts at line 135, any exception thrown during LockManager initialization (e.g., from get_redis_connection() failing) would leave the pre-downloaded temp file orphaned on disk. The file download and cleanup need to be wrapped in the same try/finally scope.
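One way to close that gap, sketched under the assumption that the constructor arguments and lock call below roughly match the real LockManager API (they are illustrative, not copied from the diff):

import os

pre_downloaded_path = None
try:
    pre_downloaded_path = self._pre_download_upload_file(db_session, repoid, params)
    # Constructing the LockManager inside the same try means a failure in
    # get_redis_connection() still reaches the cleanup below.
    lock_manager = LockManager(
        repoid=repoid,
        commitid=commitid,
        report_type=ReportType.BUNDLE_ANALYSIS,
    )
    with lock_manager.locked(LockType.BUNDLE_ANALYSIS_PROCESSING):
        ...  # process the upload, preferring pre_downloaded_path when present
finally:
    if pre_downloaded_path and os.path.exists(pre_downloaded_path):
        os.remove(pre_downloaded_path)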


Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. In 2022 this entity acquired Codecov and as result Sentry is going to need some rights from me in order to utilize my contributions in this PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.
Note
Medium Risk
Touches bundle-analysis processing flow and temp-file lifecycle around locked sections; incorrect handling could cause extra retries, leaked temp files, or inconsistent report ingestion under concurrency.
Overview
Reduces bundle-analysis lock contention by pre-downloading the upload payload before acquiring the commit-level Redis lock, then passing the local path through to `BundleAnalysisReportService.process_upload` to skip the in-lock GCS download and clean up temp files safely.
Adds groundwork for more granular locking via `LockManager(lock_key_suffix=...)`, plus a streaming `extract_bundle_name_from_file` helper (with early termination) intended to support per-bundle lock keys; updates/adds unit tests to cover the new pre-download and lock-name behaviors.
Written by Cursor Bugbot for commit 7dfb50f.
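To make that groundwork concrete, a hedged sketch of what the streaming helper could look like with ijson; the actual implementation in the PR may differ (for example, in whether the ~100-event limit from Issue 1 is enforced):

import ijson


def extract_bundle_name_from_file(local_path):
    """Stream the JSON payload and return the top-level bundleName, if any."""
    with open(local_path, "rb") as f:
        for count, (prefix, event, value) in enumerate(ijson.parse(f)):
            if prefix == "bundleName":
                return value
            if count > 100:
                # Give up early; bundleName is expected near the top of the file.
                break
    return None

A value returned here could eventually feed `LockManager(lock_key_suffix=...)` to build per-bundle lock keys, which is the follow-up work this PR is preparing for.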