Skip to content

fix: keep failed-upload cleanup inside the upload worker#5249

Open
Ma77Ball wants to merge 6 commits into
apache:mainfrom
Ma77Ball:fix/deleteObjectCalledTwice
Open

fix: keep failed-upload cleanup inside the upload worker#5249
Ma77Ball wants to merge 6 commits into
apache:mainfrom
Ma77Ball:fix/deleteObjectCalledTwice

Conversation

@Ma77Ball
Copy link
Copy Markdown
Contributor

@Ma77Ball Ma77Ball commented May 27, 2026

What changes were proposed in this PR?

What Caused the Issue:
LargeBinaryOutputStream looked up the S3 client twice: once in the upload worker (correct), and once again in close() during failed-upload cleanup. When a test left a stream unclosed, Python's GC eventually called del → close() → the second lookup, but by then a different test was active, so the cleanup hit the wrong test's mock_s3 and broke its assert_called_once_with.

Proposed Fix

  • Move s3.delete_object(...) from _cleanup_failed_upload() into the upload worker, reusing the s3 client already captured by the closure that did the upload.
  • Drop the _cleanup_failed_upload() method and the call to it from close(); the worker now handles cleanup before recording the exception.
  • close() and __del__ no longer call back into large_binary_manager, so a finalizer firing under a later test's monkey-patches cannot reach the wrong S3 client.

Any related issues, documentation, or discussions?

Closes: #5245 Follow-up to #4707; surfaced on the 3.12 leg of https://github.com/apache/texera/actions/runs/26481776334/job/77980417021.

How was this PR tested?

  • Ran ruff format and ruff check over amber/src/main/python and amber/src/test/python (clean).
  • Existing tests in test_large_binary_output_stream.py still cover the relevant paths: test_close_handles_upload_error, test_delete_object_failure_is_swallowed, and test_write_after_upload_error_raises_error.
  • Simplified test_write_after_upload_error_raises_error back to inline form and removed the _drained helper, both no longer needed once cleanup is structurally contained.

Was this PR authored or co-authored using generative AI tooling?

Co-authored with Claude Opus 4.7 in compliance with ASF

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 48.94%. Comparing base (d8c254c) to head (726b676).

Additional details and impacted files
@@             Coverage Diff              @@
##               main    #5249      +/-   ##
============================================
- Coverage     48.95%   48.94%   -0.02%     
+ Complexity     2377     2373       -4     
============================================
  Files          1048     1048              
  Lines         40270    40269       -1     
  Branches       4272     4272              
============================================
- Hits          19714    19708       -6     
- Misses        19402    19405       +3     
- Partials       1154     1156       +2     
Flag Coverage Δ *Carryforward flag
access-control-service 39.53% <ø> (ø) Carriedforward from 9dd861f
agent-service 33.76% <ø> (ø) Carriedforward from 9dd861f
amber 51.53% <ø> (-0.04%) ⬇️ Carriedforward from 9dd861f
computing-unit-managing-service 0.00% <ø> (ø) Carriedforward from 9dd861f
config-service 0.00% <ø> (ø) Carriedforward from 9dd861f
file-service 37.99% <ø> (ø) Carriedforward from 9dd861f
frontend 40.64% <ø> (ø) Carriedforward from 9dd861f
python 90.79% <100.00%> (-0.01%) ⬇️
workflow-compiling-service 56.81% <ø> (ø) Carriedforward from 9dd861f

*This pull request uses carry forward flags. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Ma77Ball
Copy link
Copy Markdown
Contributor Author

/request-review @Xiao-zhen-Liu

@github-actions github-actions Bot requested a review from Xiao-zhen-Liu May 27, 2026 19:47
Removed comment about reusing S3 for cleanup in upload worker.

Signed-off-by: Matthew B. <mgball@uci.edu>
@Xiao-zhen-Liu Xiao-zhen-Liu requested review from kunwp1 and removed request for Xiao-zhen-Liu May 28, 2026 00:23
@Xiao-zhen-Liu
Copy link
Copy Markdown
Contributor

Requesting review from @kunwp1 as he is maintaining these code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky test: test_delete_object_failure_is_swallowed fails on Python 3.12.13 CI runners

3 participants