Use count-based progress for Docker image pulls#6371

Closed
agners wants to merge 3 commits into main from refactor-docker-pull-progress

Conversation

@agners
Member

@agners agners commented Dec 1, 2025

Proposed change

Refactor Docker image pull progress to use a simpler count-based approach where each layer contributes equally (100% / total_layers) regardless of size.

The core issue was that Docker rate-limits concurrent downloads (by default 3 at a time, see Docker's DefaultMaxConcurrentDownloads constant) and reports a layer's size only when its download starts. With the previous size-weighted progress calculation, large layers appearing late would cause progress to drop dramatically (e.g., 59% -> 29%) as the total size increased. We prevented the progress from going backwards, but in practice that meant the progress would stall for an extended amount of time.
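
To make the regression concrete, here is a minimal sketch of a size-weighted calculation. It is not the Supervisor's previous code: the function name and the byte counts are hypothetical, chosen only so the numbers resemble the 59% -> 29% example above.

```python
def size_weighted_progress(layers: list[tuple[int, int]]) -> float:
    """Overall percent from (downloaded, total) pairs of layers whose size is known.

    Hypothetical helper for illustration; queued layers are absent because
    Docker has not reported their size yet.
    """
    known_total = sum(total for _, total in layers)
    if known_total == 0:
        return 0.0
    downloaded = sum(current for current, _ in layers)
    return 100.0 * downloaded / known_total


# Three small layers are downloading; further layers are still queued.
early = [(20, 30), (25, 40), (14, 30)]
before = size_weighted_progress(early)

# A large layer leaves the queue and its size becomes known,
# doubling the denominator while the numerator stays the same.
late = early + [(0, 100)]
after = size_weighted_progress(late)

print(f"{before:.1f}% -> {after:.1f}%")
```

Clamping the reported value to its previous maximum hides the drop, but the displayed progress then stays put until the real value catches up, which is the stall described above.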

The new approach:

  • Each layer contributes equally to overall progress
  • Per-layer progress: 70% download weight, 30% extraction weight
  • Progress only starts after the first "Downloading" event (when the full layer count is known)
  • Always caps at 99% - job completion handles final 100%
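
The rules above could be sketched roughly as follows. This is an illustrative outline only, not the contents of pull_progress.py: LayerState, overall_progress, and the download_started flag are assumed names, and the per-layer fractions are assumed to be values between 0 and 1.

```python
from dataclasses import dataclass

DOWNLOAD_WEIGHT = 0.7  # share of a layer's progress attributed to downloading
EXTRACT_WEIGHT = 0.3   # share attributed to extraction
MAX_PERCENT = 99.0     # the final 100% is left to job completion


@dataclass
class LayerState:
    """Hypothetical per-layer state; both fields are fractions in [0, 1]."""

    download: float = 0.0
    extract: float = 0.0

    def progress(self) -> float:
        return DOWNLOAD_WEIGHT * self.download + EXTRACT_WEIGHT * self.extract


def overall_progress(layers: list[LayerState], download_started: bool) -> float:
    """Count-based percent: every layer contributes 100 / len(layers)."""
    if not download_started or not layers:
        # Before the first "Downloading" event the layer count is not reliable.
        return 0.0
    percent = 100.0 * sum(layer.progress() for layer in layers) / len(layers)
    return min(percent, MAX_PERCENT)


# One layer fully done, one downloaded but not yet extracted.
layers = [LayerState(download=1.0, extract=1.0), LayerState(download=1.0)]
print(round(overall_progress(layers, download_started=True), 1))
```

Because the denominator is the layer count rather than a byte total, a late-appearing large layer cannot shrink the reported value; it only makes that layer's share advance more slowly.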

This moves progress tracking to a dedicated module (pull_progress.py) and removes the complex size-based scaling logic that tried to account for unknown layer sizes. With this, there is always some progress. Unfortunately, it also means that progress slows down towards completion, since the larger layers are still being downloaded or extracted at that point. But this is the best we can do currently.

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New feature (which adds functionality to the supervisor)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Additional information

  • This PR fixes or closes issue: fixes #
  • This PR is related to issue:
  • Link to documentation pull request:
  • Link to cli pull request:
  • Link to client library pull request:

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Ruff (ruff format supervisor tests)
  • Tests have been added to verify that the new code works.

If API endpoints or add-on configuration are added/changed:

@agners agners added the refactor label (A code change that neither fixes a bug nor adds a feature) Dec 1, 2025
@agners agners force-pushed the refactor-docker-pull-progress branch 2 times, most recently from c70d3b8 to ffc3783 Compare December 1, 2025 13:47
@agners agners requested a review from mdegat01 December 1, 2025 17:13
@agners agners force-pushed the refactor-docker-pull-progress branch from 3b608f3 to 8350a24 Compare December 1, 2025 17:19
agners and others added 3 commits December 1, 2025 21:19
Refactor Docker image pull progress to use a simpler count-based approach
where each layer contributes equally (100% / total_layers) regardless of
size. This replaces the previous size-weighted calculation that was
susceptible to progress regression.

The core issue was that Docker rate-limits concurrent downloads (~3 at a
time) and reports layer sizes only when downloading starts. With size-
weighted progress, large layers appearing late would cause progress to
drop dramatically (e.g., 59% -> 29%) as the total size increased.

The new approach:
- Each layer contributes equally to overall progress
- Per-layer progress: 70% download weight, 30% extraction weight
- Progress only starts after first "Downloading" event (when layer
  count is known)
- Always caps at 99% - job completion handles final 100%

This simplifies the code by moving progress tracking to a dedicated
module (pull_progress.py) and removing complex size-based scaling logic
that tried to account for unknown layer sizes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Layers that already exist locally should not count towards download
progress since there's nothing to download for them. Only layers that
need pulling are included in the progress calculation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@agners agners force-pushed the refactor-docker-pull-progress branch from bb9145c to 87e1e7a Compare December 1, 2025 20:19
@agners agners marked this pull request as draft December 2, 2025 14:24
@agners
Member Author

agners commented Feb 4, 2026

With #6379 landed this is obsolete. Closing.

@agners agners closed this Feb 4, 2026
@github-actions github-actions Bot locked and limited conversation to collaborators Feb 6, 2026

Labels

cla-signed, refactor (A code change that neither fixes a bug nor adds a feature)
