feat: add composite upload option for large file writes#1254

Open
mishushakov wants to merge 14 commits into main from mishushakov/composite-upload

Conversation

@mishushakov
Member

@mishushakov mishushakov commented Apr 3, 2026

Summary

  • Adds automatic composite upload for large files (>64MB) in both the JS and Python SDKs: write() transparently splits data into 64MB chunks, uploads them in parallel, then composes them server-side with zero-copy concatenation via the new POST /files/compose endpoint
  • Adds /files/compose endpoint to the envd OpenAPI spec with ComposeRequest schema (source_paths, destination, username) and regenerates JS SDK types
  • Files at or below 64MB use the normal single upload path — no user-facing API changes needed
  • Supports gzip compression for chunk uploads
  • Supports both sync and async Python SDKs
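
The size-based path selection described above can be sketched roughly as follows (the constant and helper names are illustrative, not the SDK's actual internals):

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64MB: both the threshold and the chunk size


def needs_composite(size: int, threshold: int = CHUNK_SIZE) -> bool:
    # Files at or below the threshold take the normal single-upload path.
    return size > threshold


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks; the last chunk may be smaller."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
```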

Test plan

  • Upload a file >64MB and verify it is chunked and composed correctly
  • Upload a file <64MB and verify normal upload path is used
  • Test with gzip: true on large file uploads
  • Test with both JS and Python SDKs (sync and async)
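
For the round-trip checks in the test plan, a simple content-verification helper (hypothetical, not part of the SDK) is enough to confirm a composed file matches the original:

```python
import hashlib


def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_roundtrip(original: bytes, downloaded: bytes) -> bool:
    # Length check first (cheap), then a content hash to catch
    # reordered or truncated chunks after compose.
    return len(original) == len(downloaded) and checksum(original) == checksum(downloaded)
```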

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@changeset-bot

changeset-bot bot commented Apr 3, 2026

🦋 Changeset detected

Latest commit: 603f317

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages
Name              Type
@e2b/python-sdk   Minor
e2b               Minor


@cursor

cursor bot commented Apr 3, 2026

PR Summary

Medium Risk
Adds a new envd API endpoint and changes SDK upload behavior for large files (parallel chunk uploads + server-side compose), which could impact reliability, timeouts, and disk usage for write operations.

Overview
Adds a new envd API POST /files/compose (with ComposeRequest) to concatenate multiple uploaded parts into a single destination file, and regenerates the JS OpenAPI types accordingly.

Updates the JS SDK Filesystem.write() and the async Python SDK Filesystem.write() to transparently switch to a composite upload for files larger than 64MB: split into 64MB temp chunks under /tmp, upload chunks in parallel (optionally gzip-compressed), then call /files/compose to finalize the write. Includes a changeset bump for @e2b/python-sdk and e2b.

Written by Cursor Bugbot for commit 603f317. This will update automatically on new commits.

@github-actions
Contributor

github-actions bot commented Apr 3, 2026

Package Artifacts

Built from 8c9603e. Download artifacts from this workflow run.

JS SDK (e2b@2.19.1-mishushakov-composite-upload.0):

npm install ./e2b-2.19.1-mishushakov-composite-upload.0.tgz

CLI (@e2b/cli@2.9.1-mishushakov-composite-upload.0):

npm install ./e2b-cli-2.9.1-mishushakov-composite-upload.0.tgz

Python SDK (e2b==2.20.0+mishushakov-composite-upload):

pip install ./e2b-2.20.0+mishushakov.composite.upload-py3-none-any.whl


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


- Build chunk_paths deterministically before asyncio.gather in async _composite_write
- Use Username type instead of bare string in JS compositeWrite
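
The first fix can be sketched like this (the path naming and helper signature are assumptions, not the SDK's actual code): build the ordered chunk-path list before scheduling the uploads, so compose order never depends on which task finishes first.

```python
import asyncio


async def composite_write_sketch(chunks, upload_chunk):
    # Deterministic paths, fixed *before* any concurrency starts.
    chunk_paths = [f"/tmp/.chunk-{i}" for i in range(len(chunks))]
    await asyncio.gather(
        *(upload_chunk(path, chunk) for path, chunk in zip(chunk_paths, chunks))
    )
    # This ordered list would be sent as source_paths to POST /files/compose.
    return chunk_paths
```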

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mishushakov and others added 5 commits April 3, 2026 13:01
Use already-materialized blob/content instead of re-reading original data.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When data fits in a single chunk, fall through to the normal write path
instead of duplicating the upload logic inside compositeWrite.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the `composite` option from `write()`. Files over 64MB are now
automatically chunked and uploaded via the composite path when the envd
version supports it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When data is an IO object and ≤64MB, to_upload_body() consumes the
stream. Pass the materialized bytes to write_files() instead of the
exhausted IO object.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
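
The underlying pitfall is easy to reproduce: a file-like object can only be read once, so any step that consumes it (as to_upload_body() does) must hand the materialized bytes onward. A minimal sketch, with materialize as a hypothetical name:

```python
import io


def materialize(data) -> bytes:
    # Read streams exactly once up front; passing an exhausted IO object
    # to a later upload step would write zero bytes.
    if isinstance(data, (bytes, bytearray)):
        return bytes(data)
    if isinstance(data, str):
        return data.encode()
    if hasattr(data, "read"):
        return data.read()
    raise TypeError(f"unsupported data type: {type(data).__name__}")
```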
@mishushakov mishushakov marked this pull request as ready for review April 3, 2026 13:36
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9de5bc1fe8

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

mishushakov and others added 3 commits April 3, 2026 15:59
Composite upload's primary benefit is parallel chunk uploading, which
the sync SDK cannot leverage (sequential HTTP requests negate the
performance advantage). Only the async Python SDK and JS SDK retain
composite upload support via asyncio.gather() and Promise.all().

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
write_files() already calls to_upload_body internally, so the
pre-materialization in write() was unnecessary after removing the
composite upload size check.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace asyncio.gather with asyncio.TaskGroup for structured
concurrency, and offload gzip compression to a thread to avoid
blocking the event loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


asyncio.TaskGroup requires Python 3.11+, which the SDK's type checker
does not support. Revert to asyncio.gather for broader compatibility.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
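
For context on the revert: asyncio.gather has been available since well before 3.11 (asyncio.to_thread since 3.9), and it returns results ordered by input position regardless of completion order, which is the property the deterministic chunk_paths rely on. A minimal illustration:

```python
import asyncio


async def gather_is_ordered():
    async def job(i: int) -> int:
        # Tasks deliberately finish in reverse order.
        await asyncio.sleep(0.01 * (3 - i))
        return i

    # gather returns results in the order the awaitables were passed,
    # not the order in which they completed.
    return await asyncio.gather(job(0), job(1), job(2))
```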