docs: add expected-behavior specs for all 9 factory agents by gx-ai-architect · Pull Request #790 · akashgit/remote-factory

gx-ai-architect · 2026-06-25T22:01:44Z

Summary

Closes #788

Creates a detailed expected-behavior doc for each of the 9 factory agents. These docs serve as diagnostic baselines: when an agent misbehaves, compare its execution trace against that agent's expected-behavior doc to pinpoint exactly where it diverged.
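To illustrate the comparison described in the summary, here is a minimal sketch that finds the first point where an observed trace departs from an expected ordered step list. The step names and the `first_divergence` helper are hypothetical and are not taken from the docs in this PR; the real docs describe steps in prose, so some extraction would be needed first.

```python
from itertools import zip_longest
from typing import Optional, Sequence, Tuple


def first_divergence(
    expected: Sequence[str], trace: Sequence[str]
) -> Optional[Tuple[int, Optional[str], Optional[str]]]:
    """Return (index, expected_step, observed_step) at the first mismatch.

    Returns None when the trace matches the expected steps exactly.
    A missing step on either side shows up as None in the tuple.
    """
    for index, (exp, obs) in enumerate(zip_longest(expected, trace)):
        if exp != obs:
            return index, exp, obs
    return None


# Hypothetical step names, for illustration only.
expected_steps = ["read_inputs", "plan", "write_output", "handoff"]
observed_steps = ["read_inputs", "write_output", "handoff"]

print(first_divergence(expected_steps, observed_steps))
# → (1, 'plan', 'write_output')
```

Reporting only the first mismatch keeps the output focused on where the agent left the expected path, since later steps are often wrong as a consequence.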

Files created (9 docs, ~2,500 lines total)

| File | Lines | Workflows covered |
|---|---|---|
| `docs/expected-behaviors/ceo.md` | 418 | All 8 (Build, Design, Discover, Review, Improve, Research, Refine, Meta) |
| `docs/expected-behaviors/researcher.md` | 302 | Build, Improve, Research, Meta |
| `docs/expected-behaviors/strategist.md` | 354 | Build, Design, Improve, Research, Meta |
| `docs/expected-behaviors/builder.md` | 315 | Build, Design, Improve, Research, Refine, Meta |
| `docs/expected-behaviors/qa.md` | 301 | Improve, Research, Refine |
| `docs/expected-behaviors/archivist.md` | 382 | Build, Design, Improve, Research, Refine, Meta |
| `docs/expected-behaviors/failure-analyst.md` | 136 | Research |
| `docs/expected-behaviors/refiner.md` | 160 | Refine |
| `docs/expected-behaviors/profiler.md` | 145 | Cross-cutting |

Each doc contains

- Identity & Responsibility: what the agent IS vs IS NOT, and its relationship to other agents
- Per-Workflow Behavior: for each workflow, the phase, inputs, ordered steps, outputs, and handoffs
- Invariants: hard rules (MUST/NEVER/ALWAYS) to check first in any trace
- Constraints & Forbidden Actions: exhaustive list of what the agent must not do
- Failure Modes & Diagnostic Signals: table of trace signals for each known failure
- Interaction Protocol: output format and CEO review criteria
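Since the Invariants section is meant to be checked first, a reviewer might want to pull those rules out of a doc before reading a trace. The sketch below assumes a layout that this PR description does not confirm: a Markdown heading named `Invariants` followed by bullet lines containing MUST, NEVER, or ALWAYS. The sample text and `extract_invariants` are hypothetical.

```python
import re
from typing import List

# Hard-rule keywords named in the PR description.
KEYWORD = re.compile(r"\b(MUST|NEVER|ALWAYS)\b")


def extract_invariants(markdown: str, heading: str = "Invariants") -> List[str]:
    """Collect keyword-bearing lines under the given heading (assumed layout)."""
    rules: List[str] = []
    in_section = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Any heading either opens the target section or closes it.
            in_section = stripped.lstrip("#").strip() == heading
            continue
        if in_section and KEYWORD.search(stripped):
            rules.append(stripped.lstrip("-* ").strip())
    return rules


# Hypothetical doc content, not copied from the real files.
sample = """# Hypothetical agent
## Invariants
- MUST write results before handoff
- NEVER edit files outside its phase
Some commentary without a keyword.
## Failure Modes
- ALWAYS ignored here because the section changed
"""

print(extract_invariants(sample))
# → ['MUST write results before handoff', 'NEVER edit files outside its phase']
```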

Verified against issue #783

The CEO doc explicitly covers the build-mode respawn loop bug (#783) with 3 distinct trace signals:

- `results.tsv` is header-only after build phases completed
- Respawn input reports "0/N phases complete" when `git log` shows N commits
- `current.md` is overwritten outside the Strategist phase
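The first two signals above lend themselves to a mechanical check. The following is a rough sketch under several assumptions: the TSV column names are invented, the progress string is matched with a regex inferred from the wording "0/N phases complete", and the commit count is passed in by the caller (for example from `git rev-list --count HEAD`). The third signal is omitted because it needs phase information that this description does not specify.

```python
import re
from typing import List, Optional

# Pattern inferred from the signal wording; the real input format may differ.
PROGRESS = re.compile(r"(\d+)/(\d+) phases complete")


def is_header_only(tsv_text: str) -> bool:
    """True when the TSV has exactly one non-blank row (the header)."""
    rows = [line for line in tsv_text.splitlines() if line.strip()]
    return len(rows) == 1


def reported_phases(respawn_input: str) -> Optional[int]:
    """Return the completed-phase count stated in the respawn input, if any."""
    match = PROGRESS.search(respawn_input)
    return int(match.group(1)) if match else None


def matched_signals(tsv_text: str, respawn_input: str, commit_count: int) -> List[str]:
    """List which of the two checked signals are present."""
    signals: List[str] = []
    if commit_count > 0 and is_header_only(tsv_text):
        signals.append("results.tsv header-only although commits exist")
    if commit_count > 0 and reported_phases(respawn_input) == 0:
        signals.append("respawn input reports 0 phases although commits exist")
    return signals


# Hypothetical inputs for illustration.
print(matched_signals("phase\tstatus\n", "0/4 phases complete", commit_count=4))
```

Returning the matched signals as a list, instead of a single boolean, keeps each signal independently visible, which matches how the doc treats them as distinct.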

QA verification

All docs verified against actual agent prompts and playbooks:

- 9/9 accuracy (all claims match actual prompt files)
- 8/9 completeness (`builder.md` Design workflow added in fix commit)
- Cross-agent consistency verified (handoff descriptions match across docs)
- Evaluator/QA naming ambiguity documented with clarification note

🤖 Generated with Claude Code


github-actions · 2026-06-25T22:02:00Z

Sentrux Quality Report

Absolute

Scanning ....
[scan] git ls-files: 307 total, 295 kept, 12 dropped (ext:12, meta:0, big:0)
[build_project_map] 295 files, 53 unique dirs, 49 cache misses, 2.0ms
[resolve] 434 resolved, 748 unresolved (of 1182 total specs)
[resolve_imports] project_map 2.1ms, suffix_idx 0.6ms, suffix_resolve 7.0ms, total 9.7ms
[build_graphs] 295 files | maps 0.8ms, imports 9.7ms, calls+inherit 2.5ms, total 13.1ms | 433 import, 4356 call, 0 inherit edges
sentrux check — 2 rules checked

Quality: 4706

✗ [Error] max_cc: 5 function(s) exceed max cyclomatic complexity of 30
    factory/cli.py:cmd_ceo (cc=78)
    factory/study.py:study_project_local (cc=43)
    factory/cli.py:_welcome_wizard (cc=39)
    factory/cli.py:cmd_run (cc=36)
    factory/workflow/validation.py:validate_workflow (cc=31)

✗ 1 violation(s) found
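For readers unfamiliar with the `max_cc` rule, the sketch below approximates cyclomatic complexity for Python functions using the standard `ast` module. It is not Sentrux's algorithm, which this report does not describe, so its numbers will not necessarily match the `cc=` values above. The `approx_cc` and `functions_over` names are hypothetical.

```python
import ast
from typing import Dict

# Node types that each add one decision point in this approximation.
BRANCH_NODES = (
    ast.If, ast.For, ast.While, ast.AsyncFor,
    ast.ExceptHandler, ast.IfExp, ast.comprehension,
)


def approx_cc(func: ast.AST) -> int:
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            score += len(node.values) - 1
    return score


def functions_over(source: str, limit: int) -> Dict[str, int]:
    """Map function name to approximate complexity for functions above limit."""
    tree = ast.parse(source)
    result: Dict[str, int] = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = approx_cc(node)
            if score > limit:
                result[node.name] = score
    return result


sample = '''
def simple(x):
    return x

def branchy(x, y):
    if x and y:
        return 1
    for i in range(3):
        if i == x:
            return i
    return 0
'''

print(functions_over(sample, limit=3))
# → {'branchy': 5}
```

Note that nested functions are counted toward their enclosing function in this sketch, which a production tool may handle differently.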

Diff (vs base branch)

Scanning ....
[scan] git ls-files: 307 total, 295 kept, 12 dropped (ext:12, meta:0, big:0)
[build_project_map] 295 files, 53 unique dirs, 49 cache misses, 2.1ms
[resolve] 434 resolved, 748 unresolved (of 1182 total specs)
[resolve_imports] project_map 2.2ms, suffix_idx 0.6ms, suffix_resolve 7.1ms, total 9.8ms
[build_graphs] 295 files | maps 0.8ms, imports 9.9ms, calls+inherit 2.5ms, total 13.2ms | 433 import, 4356 call, 0 inherit edges
sentrux gate — structural regression check

Quality:      4706 -> 4706
Coupling:     0.75 → 0.75
Cycles:       4 → 4
God files:    0 → 0

Distance from Main Sequence: 0.35

✓ No degradation detected

codecov · 2026-06-25T22:12:03Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.78%. Comparing base (04ce092) to head (bf7a4cf).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #790   +/-   ##
=======================================
  Coverage   86.78%   86.78%           
=======================================
  Files          80       80           
  Lines       12134    12134           
=======================================
  Hits        10531    10531           
  Misses       1603     1603


xukai92 and others added 4 commits June 25, 2026 21:40

32383ba docs: add expected-behavior specs for Failure Analyst, Refiner, Profiler agents
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

59624b8 docs: add expected-behavior specs for Builder, QA, Archivist agents
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

b2e8bad docs: add expected-behavior specs for CEO, Researcher, Strategist agents
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

bf7a4cf docs: fix QA findings — add Design workflow, clarify Evaluator/QA naming
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

gx-ai-architect wants to merge 4 commits into main from expected-behavior-docs
