-
-
Notifications
You must be signed in to change notification settings - Fork 896
fix(cli): stop dev runs stuck in dequeued status (fix #2639) #2699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🦋 Changeset detectedLatest commit: bf8e277 The changes in this PR will be included in the next version bump. This PR includes changesets to release 24 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
WalkthroughThis pull request addresses two dev run issues. A changelog entry documents fixes for: (1) dev runs getting stuck in DEQUEUED status, and (2) preventing ENOENT "System failure" errors in dev runs after the first change. The fixes involve removing esbuild output directory cleanup from the dev session rebuild flow, and refactoring dev supervisor worker management by eliminating client-side active-old-worker filtering logic, increasing worker deletion delay from 1 to 5 seconds, and removing worker deletion checks that previously prevented deletion when in-progress runs existed. Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20–25 minutes
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
packages/cli-v3/src/dev/devSupervisor.ts (1)
600-613: Worker cleanup is incomplete – potential resource leak in long-running dev sessions.The method now only deprecates the version in the process pool but never removes workers from
this.workersmap or callsworker.stop(). Over an extended dev session with many code changes, the map and associated resources could accumulate indefinitely.Consider adding cleanup logic to remove workers from the map after deprecation and a grace period. For example:
#deleteWorker(friendlyId: string) { logger.debug("[DevSupervisor] Delete worker (if relevant)", { workerId: friendlyId, }); const worker = this.workers.get(friendlyId); if (!worker) { return; } if (worker.serverWorker?.version) { this.taskRunProcessPool?.deprecateVersion(worker.serverWorker?.version); } + + // Remove from map to prevent accumulation + this.workers.delete(friendlyId); + + // Optionally stop the worker's background process + worker.stop(); }Alternatively, document why workers are intentionally left in the map (e.g., if they're needed for debugging or if cleanup happens elsewhere).
🧹 Nitpick comments (1)
packages/cli-v3/src/dev/devSupervisor.ts (1)
595-598: The 5-second timeout is arbitrary and may not prevent all race conditions.While increasing from 1s to 5s gives more breathing room, it's still a fixed delay that doesn't account for actual run state. Long-running tasks could still be active when deprecation occurs.
Consider checking if the worker has active runs before deprecating, or using a signal-based approach rather than a fixed timeout. For example:
async #tryDeleteWorker(friendlyId: string) { await awaitTimeout(5_000); + // Only deprecate if no active runs remain + const worker = this.workers.get(friendlyId); + if (worker && this.#workerHasActiveRuns(worker)) { + return; + } this.#deleteWorker(friendlyId); }Note: This would require re-introducing logic to check for active runs, which was removed in this PR.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.changeset/young-ligers-suffer.md(1 hunks)packages/cli-v3/src/dev/devSession.ts(0 hunks)packages/cli-v3/src/dev/devSupervisor.ts(2 hunks)
💤 Files with no reviewable changes (1)
- packages/cli-v3/src/dev/devSession.ts
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-08T11:48:12.327Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2593
File: packages/core/src/v3/workers/warmStartClient.ts:168-170
Timestamp: 2025-10-08T11:48:12.327Z
Learning: The trigger.dev runners execute only in Node 21 and 22 environments, so modern Node.js APIs like AbortSignal.any (introduced in v20.3.0) are supported.
Applied to files:
.changeset/young-ligers-suffer.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (2)
.changeset/young-ligers-suffer.md (1)
1-6: Changelog entry is clear and accurately documents the fixes.The changeset follows the correct format with appropriate frontmatter, and both fix descriptions are user-facing, clear, and align with the related code changes (esbuild cleanup removal and dev supervisor worker management refactoring).
packages/cli-v3/src/dev/devSupervisor.ts (1)
269-272: Server-side dequeue handler confirmed to ignore oldWorkers—change verified as correct.The server-side route handler at
apps/webapp/app/routes/engine.v1.dev.dequeue.tsexplicitly states: "Even though we don't use it, we need to keep it for backwards compatibility." The request body schema is retained only for backward compatibility, but the parameters are never actually consumed by the handler. Passing an empty array foroldWorkersis safe and introduces no risk of regressions.
This PR fixes #2639, by no longer eagerly deleting "old" worker builds. This means that worker builds will persist to the filesystem now until the dev CLI is exited but this is only way to make sure that runs don't get stuck (the CLI fundamentally can't know when it's safe to delete worker builds).
This PR also fixes another issue that was occurring because we always immediately deleted the very first worker build when the first change was made. This would result in runs executing on the first version the dev CLI built failing, most of them with an ENOENT error because they were trying to execute or import a file that no longer existed.