
feat: parallel dgoss execution via shared ParallelShellExecutor #588

Draft
ianpittwood wants to merge 16 commits into main from feat/parallel-calls

@ianpittwood

Contributor

Summary

Adds bounded-concurrency parallel execution of dgoss test commands, built on a new tool-agnostic posit_bakery/parallel/ module so other plugins (hadolint, wizcli) and future sequential builds can adopt the same pattern.

  • posit_bakery/parallel/: ShellTask, ShellResult, resolve_max_workers, and ParallelShellExecutor (a ThreadPoolExecutor-backed runner). Worker threads only run their subprocess and return captured bytes; the main thread drains completions, drives an optional Rich live status table, and invokes a per-result callback, so consumers mutate shared state without locks. Results are returned in input order; spawn failures are captured and never crash the run.
  • Concurrency control: --jobs/-j CLI flag on bakery dgoss run → BAKERY_MAX_CONCURRENCY env / SETTINGS.max_concurrency → default of 4 (modest, since each task is a Docker container). Malformed env values fall back to the default with a warning instead of crashing the CLI.
  • Streaming/logging: the live Progress table shares the same stderr_console as the logger, so per-task detail (flushed on completion, on the main thread) renders cleanly above the live region with no interleaving. The table auto-disables on non-TTY/CI, --quiet, or single-task runs, falling back to plain logging.
  • dgoss: DGossSuite.run() now dispatches through the executor while preserving its exact (GossJsonReportCollection, errors) return contract; per-result parse/persist/classify logic is unchanged.
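
The executor pattern described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: SketchTask, SketchResult, and run_parallel are hypothetical stand-ins for ShellTask, ShellResult, and ParallelShellExecutor, and the live status table is omitted.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class SketchTask:  # hypothetical stand-in for ShellTask
    name: str
    argv: Sequence[str]


@dataclass
class SketchResult:  # hypothetical stand-in for ShellResult
    task: SketchTask
    returncode: Optional[int]
    stdout: bytes
    stderr: bytes
    spawn_error: Optional[Exception] = None


def _run_one(task: SketchTask) -> SketchResult:
    # Worker thread: run the subprocess and return captured bytes, nothing else.
    try:
        proc = subprocess.run(list(task.argv), capture_output=True)
    except OSError as exc:
        # A spawn failure (e.g. a missing binary) becomes a result, not a crash.
        return SketchResult(task, None, b"", b"", spawn_error=exc)
    return SketchResult(task, proc.returncode, proc.stdout, proc.stderr)


def run_parallel(
    tasks: Sequence[SketchTask],
    max_workers: int = 4,
    on_result: Optional[Callable[[SketchResult], None]] = None,
) -> List[Optional[SketchResult]]:
    results: List[Optional[SketchResult]] = [None] * len(tasks)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(_run_one, t): i for i, t in enumerate(tasks)}
        # The main thread drains completions, so the callback needs no locks.
        for fut in as_completed(futures):
            res = fut.result()
            results[futures[fut]] = res  # slot by input index keeps input order
            if on_result is not None:
                on_result(res)
    return results


# Usage: one task that succeeds and one whose binary does not exist.
tasks = [
    SketchTask("ok", [sys.executable, "-c", "print('hello')"]),
    SketchTask("missing", ["no-such-binary-for-this-sketch"]),
]
completed: List[str] = []
results = run_parallel(tasks, max_workers=2, on_result=lambda r: completed.append(r.task.name))
```

Because the callback only ever runs on the thread that called run_parallel, appending to a plain list is safe even though the tasks themselves ran concurrently.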

Design spec and implementation plan live under docs/superpowers/ (gitignored, not committed).

Test Plan

  • just test — 1639 passed
  • New unit tests: executor (ordering, bounded concurrency, spawn-failure capture, env/cwd, main-thread callback, live-display policy), resolve_max_workers, settings env parsing (incl. malformed/empty fallback)
  • dgoss suite tests: mocked parallel happy-path, jobs → max_workers wiring, spawn-failure → BakeryDGossError
  • CLI test: dgoss run --jobs 2 forwards jobs=2 through to the suite
  • Manual smoke (requires Docker + built images): bakery dgoss run --jobs 2 -v shows the live status table + per-target detail, then the Goss summary
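
For reviewers, the resolution order and malformed-env fallback that the settings tests cover might look something like the sketch below. The function names and signatures are assumptions for illustration (the real resolve_max_workers may differ), as is the choice to treat values below 1 as malformed; only the BAKERY_MAX_CONCURRENCY variable name and the default of 4 come from the description.

```python
import logging
import os
from typing import Mapping, Optional

log = logging.getLogger(__name__)

DEFAULT_MAX_WORKERS = 4  # default stated in the PR summary


def env_max_concurrency(environ: Mapping[str, str]) -> Optional[int]:
    """Parse BAKERY_MAX_CONCURRENCY; return None if unset, empty, or malformed."""
    raw = environ.get("BAKERY_MAX_CONCURRENCY", "").strip()
    if not raw:
        return None
    try:
        value = int(raw)
    except ValueError:
        log.warning("Ignoring malformed BAKERY_MAX_CONCURRENCY=%r", raw)
        return None
    if value < 1:
        # Assumption: non-positive values are treated like malformed input.
        log.warning("Ignoring non-positive BAKERY_MAX_CONCURRENCY=%r", raw)
        return None
    return value


def resolve_max_workers_sketch(
    jobs: Optional[int],
    environ: Mapping[str, str] = os.environ,
) -> int:
    """CLI flag wins, then the environment, then the built-in default."""
    if jobs is not None:
        return max(1, jobs)  # assumption: clamp to at least one worker
    from_env = env_max_concurrency(environ)
    return from_env if from_env is not None else DEFAULT_MAX_WORKERS
```

Passing the environment as a mapping keeps the function easy to unit-test without patching os.environ.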

🤖 Generated with Claude Code

ianpittwood and others added 10 commits June 9, 2026 14:45
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…play

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…wiring and spawn failures

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 9, 2026


Test Results

1 673 tests  +39   1 673 ✅ +39   8m 38s ⏱️ +35s
    1 suites ± 0       0 💤 ± 0 
    1 files   ± 0       0 ❌ ± 0 

Results for commit 816147c. ± Comparison against base commit 6bc441c.

♻️ This comment has been updated with latest results.

ianpittwood and others added 5 commits June 9, 2026 17:04
…xecutor

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… not orphaned

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…utor

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ll terminated

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ianpittwood

Contributor Author

Follow-up: per-task timeout + responsive interruption

Added on top of the initial parallel-dgoss work (4 commits, b7b5c1c..bf28a59):

Responsive interruption: ParallelShellExecutor now launches each child with start_new_session=True and tracks live processes in a lock-guarded registry. On KeyboardInterrupt/BaseException it cancels queued tasks and signals the whole process group (SIGTERM → 10s grace → SIGKILL) of every in-flight child, then re-raises. This fixes the prior behavior where Ctrl-C drained the full queue and could leave docker/dgoss containers running. Group-signaling (vs. signaling only the dgoss wrapper) is what prevents orphaned docker grandchildren; SIGTERM-first lets dgoss's EXIT trap remove its container. A registration-race window (a child spawned exactly as shutdown begins) is closed deterministically via a shutdown flag checked under the same lock.
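
A rough, POSIX-only sketch of the registry and group-signaling idea described above. ProcessRegistry and its methods are hypothetical names, not the classes in this PR, and spawning while holding the lock is a simplification chosen here to keep the shutdown-flag check and the registration atomic.

```python
import os
import signal
import subprocess
import sys
import threading
import time


def _signal_group(proc: subprocess.Popen, sig: int) -> None:
    # With start_new_session=True the child leads its own group, so pgid == pid.
    try:
        os.killpg(proc.pid, sig)
    except ProcessLookupError:
        pass  # the whole group has already exited


class ProcessRegistry:  # hypothetical name for the lock-guarded registry
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._live = set()
        self._shutting_down = False

    def spawn(self, argv):
        """Start argv in its own session; return None once shutdown has begun."""
        with self._lock:
            if self._shutting_down:
                return None  # closes the spawn-during-shutdown race
            proc = subprocess.Popen(argv, start_new_session=True)
            self._live.add(proc)
            return proc

    def discard(self, proc) -> None:
        with self._lock:
            self._live.discard(proc)

    def shutdown(self, grace: float = 10.0) -> None:
        with self._lock:
            self._shutting_down = True
            live = list(self._live)
        # SIGTERM first, so wrappers get a chance to run their cleanup traps.
        for proc in live:
            _signal_group(proc, signal.SIGTERM)
        deadline = time.monotonic() + grace
        for proc in live:
            remaining = max(0.0, deadline - time.monotonic())
            try:
                # Only the group leader is waited on in this sketch.
                proc.wait(timeout=remaining)
            except subprocess.TimeoutExpired:
                _signal_group(proc, signal.SIGKILL)
                proc.wait()


# Usage: start a long sleeper, then shut everything down.
registry = ProcessRegistry()
sleeper = registry.spawn([sys.executable, "-c", "import time; time.sleep(60)"])
registry.shutdown(grace=5.0)
late = registry.spawn([sys.executable, "-c", "pass"])
```

In a real executor the shutdown call would sit in an except BaseException handler that re-raises afterwards, as the description states.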

Per-task timeout: ShellTask.timeout is enforced via Popen.communicate(timeout=…); on expiry the child group is stopped and the result is flagged timed_out. Exposed on dgoss as GossOptions.timeout (default 900s = 15 min; 0 disables), resolved through DGossCommand → ShellTask. A timed-out target is reported as a run-failing BakeryDGossError.
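
The timeout path could be approximated as below (POSIX-only). TimedResult and run_with_timeout are illustrative names, and the grace period before SIGKILL is an assumption borrowed from the interruption behavior above; the only details taken from the description are Popen.communicate(timeout=…), stopping the child group on expiry, the timed_out flag, and 0 meaning "disabled".

```python
import os
import signal
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TimedResult:  # hypothetical result type
    returncode: Optional[int]
    stdout: bytes
    stderr: bytes
    timed_out: bool = False


def _signal_group(proc: subprocess.Popen, sig: int) -> None:
    try:
        os.killpg(proc.pid, sig)  # pgid == pid because of start_new_session
    except ProcessLookupError:
        pass


def run_with_timeout(
    argv: Sequence[str],
    timeout: Optional[float],
    grace: float = 10.0,
) -> TimedResult:
    effective = timeout if timeout else None  # 0 or None disables the timeout
    proc = subprocess.Popen(
        list(argv),
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        start_new_session=True,
    )
    try:
        out, err = proc.communicate(timeout=effective)
        return TimedResult(proc.returncode, out, err)
    except subprocess.TimeoutExpired:
        _signal_group(proc, signal.SIGTERM)
        try:
            # communicate() may be retried after a timeout without losing output.
            out, err = proc.communicate(timeout=grace)
        except subprocess.TimeoutExpired:
            _signal_group(proc, signal.SIGKILL)
            out, err = proc.communicate()
        return TimedResult(proc.returncode, out, err, timed_out=True)


# Usage: a task that overruns a short timeout, and one with the timeout disabled.
slow = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.5)
quick = run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=0)
```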

Tests: real-subprocess executor tests for timeout, SIGINT cancellation, whole-process-group termination (heartbeat-divergence), and the shutdown-race path; dgoss suite tests for timeout → error and timeout plumbing. Note: this also repaired two dgoss suite tests that the subprocess.run → Popen switch had silently broken (they now mock ParallelShellExecutor.run). just test: 1649 passed.

…-based

The previous test triggered termination via a fixed 0.5s timeout, which under
heavy CI parallelism (pytest-xdist) loses the race to double-interpreter
startup: the process group was killed before the grandchild created its
heartbeat file, causing FileNotFoundError. Wait until the grandchild is
actually writing, then interrupt — deterministic regardless of runner load.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
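
The "wait until the grandchild is actually writing" step in the commit message above amounts to polling for a readiness condition rather than sleeping for a fixed time. A minimal sketch, assuming a heartbeat file is the readiness signal; wait_until_written is a hypothetical helper name, and a timer thread stands in for the slow-starting grandchild:

```python
import tempfile
import threading
import time
from pathlib import Path


def wait_until_written(path: Path, deadline_s: float = 30.0, poll_s: float = 0.05) -> bool:
    """Poll until path exists and is non-empty; return False if the deadline passes."""
    deadline = time.monotonic() + deadline_s
    while time.monotonic() < deadline:
        try:
            if path.stat().st_size > 0:
                return True
        except FileNotFoundError:
            pass  # not created yet; keep polling
        time.sleep(poll_s)
    return False


# Usage: the timer writes the heartbeat after a delay, like a slow interpreter start.
with tempfile.TemporaryDirectory() as tmp:
    heartbeat = Path(tmp) / "heartbeat"
    writer = threading.Timer(0.2, heartbeat.write_text, args=("tick",))
    writer.start()
    ready = wait_until_written(heartbeat, deadline_s=5.0)
    writer.join()
    never = wait_until_written(Path(tmp) / "absent", deadline_s=0.2)
```

A test would send the interrupt only after the helper returns True, so the outcome no longer depends on how long process startup takes on a loaded runner.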