fix(producer): drop empty trailing chunk slice in distributed render plan #1133
Merged
miguel-heygen merged 1 commit on May 30, 2026
…plan

resolveChunkPlan caps chunkCount at maxParallelChunks from the naive count, then rounds effectiveChunkSize up to ceil(totalFrames / chunkCount). When that ceil rounds up, the first (chunkCount - 1) chunks can already cover every frame, so buildChunkSlices emits a final slice with startFrame >= totalFrames: an empty [n, n) or inverted range. renderChunk rejects it (framesInChunk <= 0) and, under Step Functions retries, fails the whole distributed render even though [0, totalFrames) is fully covered.

This is reachable from the user-facing CLI: `hyperframes lambda render --chunk-size 10 --max-parallel-chunks 12` on a ~4s/30fps (121-frame) composition yields chunkCount=12, effectiveChunkSize=11, and a 12th slice of [121, 121).

Tighten chunkCount to ceil(totalFrames / effectiveChunkSize) after the size is finalized, so the union stays exactly [0, totalFrames) with no empty tail. This only lowers chunkCount in the explicit-small-chunkSize case; the auto-sized and large-chunkSize paths already satisfy ceil(totalFrames / effectiveChunkSize) >= chunkCount, so it's a no-op there (existing tests' chunkCount values are unchanged).

Adds a regression test for the 121/10/12 case plus a grid property test asserting contiguous, non-empty, exact coverage across explicit sizes.
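The arithmetic above can be checked with a minimal sketch. This is a hypothetical reconstruction from the description, not the actual producer source: the `planSlices` helper, its `tighten` flag, and the `ChunkSlice` shape are assumptions made for illustration, and the real resolveChunkPlan / buildChunkSlices may differ in detail.

```typescript
// Hypothetical reconstruction of the plan arithmetic; not the actual producer source.
interface ChunkSlice {
  startFrame: number; // inclusive
  endFrame: number; // exclusive, so the slice is [startFrame, endFrame)
}

function planSlices(
  totalFrames: number,
  chunkSize: number,
  maxParallelChunks: number,
  tighten: boolean, // assumed switch: false = pre-fix behavior, true = with the fix
): ChunkSlice[] {
  // Cap the naive chunk count at the parallelism limit.
  let chunkCount = Math.min(maxParallelChunks, Math.ceil(totalFrames / chunkSize));
  // Grow the chunk size so that chunkCount chunks can cover every frame.
  const effectiveChunkSize = Math.max(chunkSize, Math.ceil(totalFrames / chunkCount));
  if (tighten) {
    // The fix: recompute the count from the finalized size.
    chunkCount = Math.ceil(totalFrames / effectiveChunkSize);
  }
  const slices: ChunkSlice[] = [];
  for (let i = 0; i < chunkCount; i++) {
    slices.push({
      startFrame: i * effectiveChunkSize,
      endFrame: Math.min((i + 1) * effectiveChunkSize, totalFrames),
    });
  }
  return slices;
}

// The 121-frame / chunk-size 10 / max-parallel 12 case from the description.
const before = planSlices(121, 10, 12, false);
const after = planSlices(121, 10, 12, true);
const lastBefore = before[before.length - 1];
const lastAfter = after[after.length - 1];
console.log(before.length, lastBefore); // 12 slices, last one is [121, 121)
console.log(after.length, lastAfter); // 11 slices, last one is [110, 121)
```

Under these assumptions the pre-fix plan ends in an empty slice, while the tightened plan stops at [110, 121).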
miguel-heygen (Collaborator) approved these changes on May 30, 2026 and left a comment:
Clean, correct fix. The root cause is well-explained: effectiveChunkSize rounding up post-cap left a [totalFrames, totalFrames) tail that renderChunk rejects, wedging the whole distributed render. The tighten-after-finalize approach is the right spot to fix it — no change to auto-sized or large-chunkSize paths.
The grid property test covering 150 combinations (6 frame counts × 5 max-parallel × 5 chunk sizes) is exactly the right test for an arithmetic invariant like this. No issues.
The bug
resolveChunkPlan derives chunkCount from the naive count (min(maxParallelChunks, ceil(totalFrames / resolvedChunkSize))), then rounds effectiveChunkSize up to ceil(totalFrames / chunkCount). When that ceil rounds up, the first chunkCount - 1 chunks can already cover every frame, so buildChunkSlices emits a final slice with startFrame >= totalFrames: an empty [n, n) or inverted range. renderChunk then rejects that slice (framesInChunk = endFrame - startFrame <= 0 → RenderChunkValidationError), and under Step Functions it exhausts its retries and fails the whole distributed render, even though [0, totalFrames) was already fully covered. It also violates buildChunkSlices's own documented contract ("the union is exactly [0, totalFrames)").

Reachability
This is reachable straight from the user-facing CLI. --chunk-size and --max-parallel-chunks are first-class flags on hyperframes lambda render that flow into DistributedRenderConfig → resolveChunkPlan. Running `hyperframes lambda render --chunk-size 10 --max-parallel-chunks 12` on a ~4s / 30fps (121-frame) composition gives resolvedChunkSize=10, chunkCount=min(12, 13)=12, effectiveChunkSize=max(10, ceil(121/12)=11)=11. Slices 0–10 cover [0, 121); slice 11 is [121, 121).

The fix
Tighten chunkCount to ceil(totalFrames / effectiveChunkSize) after the size is finalized, so the trailing empty slice is dropped and the union stays exactly [0, totalFrames). This only ever lowers chunkCount in the explicit-small-chunkSize case; the auto-sized and large-chunkSize paths already satisfy ceil(totalFrames / effectiveChunkSize) >= chunkCount, so it's a no-op there (every existing test's asserted chunkCount is unchanged). maxParallelChunks is still respected.

Tests
plan.test.ts gains the 121 / 10 / 12 regression case (asserts chunkCount === 11 and the last slice is [110, 121)) plus a grid property test over a range of totalFrames × maxParallelChunks × explicit chunkSize asserting every slice is non-empty, contiguous from 0, and ends exactly at totalFrames.
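
A grid property check of this kind might look roughly like the sketch below. It is an illustration only, not the test added to plan.test.ts: `slicesFor` is a hypothetical stand-in for resolveChunkPlan plus buildChunkSlices with the fix applied, `findViolation` is an assumed helper, and the grid values are made up for the example.

```typescript
// Hypothetical stand-in for resolveChunkPlan + buildChunkSlices, with the fix applied.
type Slice = { startFrame: number; endFrame: number }; // half-open [startFrame, endFrame)

function slicesFor(totalFrames: number, chunkSize: number, maxParallelChunks: number): Slice[] {
  const capped = Math.min(maxParallelChunks, Math.ceil(totalFrames / chunkSize));
  const effectiveChunkSize = Math.max(chunkSize, Math.ceil(totalFrames / capped));
  // Tightened count: derived from the finalized chunk size.
  const chunkCount = Math.ceil(totalFrames / effectiveChunkSize);
  return Array.from({ length: chunkCount }, (_, i) => ({
    startFrame: i * effectiveChunkSize,
    endFrame: Math.min((i + 1) * effectiveChunkSize, totalFrames),
  }));
}

// Returns a description of the first violated invariant, or null if the plan is valid.
function findViolation(slices: Slice[], totalFrames: number, maxParallelChunks: number): string | null {
  if (slices.length === 0) return "no slices";
  if (slices.length > maxParallelChunks) return "more chunks than maxParallelChunks";
  let cursor = 0;
  for (const s of slices) {
    if (s.startFrame !== cursor) return `gap or overlap at frame ${cursor}`;
    if (s.endFrame <= s.startFrame) return `empty or inverted slice at frame ${s.startFrame}`;
    cursor = s.endFrame;
  }
  return cursor === totalFrames ? null : `coverage ends at ${cursor}, expected ${totalFrames}`;
}

// Illustrative grid values; the PR's test may use different ones.
const frameCounts = [1, 30, 120, 121, 240, 1000];
const parallelLimits = [1, 2, 8, 12, 64];
const chunkSizes = [1, 7, 10, 30, 500];

const failures: string[] = [];
for (const totalFrames of frameCounts) {
  for (const maxParallel of parallelLimits) {
    for (const chunkSize of chunkSizes) {
      const slices = slicesFor(totalFrames, chunkSize, maxParallel);
      const violation = findViolation(slices, totalFrames, maxParallel);
      if (violation) failures.push(`${totalFrames}/${chunkSize}/${maxParallel}: ${violation}`);
    }
  }
}
console.log(failures.length === 0 ? "all combinations ok" : failures.join("\n"));
```

Collecting violations instead of throwing on the first one makes it easier to see whether a failure is isolated to one corner of the grid, such as small explicit chunk sizes with a low parallelism cap.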