feat: spec validation engine + parser + CLI #786

Open
mihirathale98 wants to merge 25 commits into main from factory/run-0743ea86

mihirathale98 commented Jun 25, 2026

Collaborator

Summary

  • Add factory/spec/parser.py — regex-based Markdown parser for .factory/repo_spec.md with RepoSpec model containing modules, dependency edges, shared contracts, entry points, and change impact entries
  • Add factory/spec/validate.py — automated validation engine with path existence checks, Python import cross-referencing (via ast.parse), orphan/hub module detection, afferent/efferent coupling metrics, and instability computation
  • Add factory spec validate <project_path> CLI subcommand
  • Replace W₉ validate_stub placeholder with real validate FnNode running factory spec validate
  • Re-export parse_spec and validate_spec from factory.spec
  • Fix stale test_all_skills_exported assertion: expect 10 workflows (was 8) after spec-generate and spec-update additions
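
The import cross-referencing step listed above can be pictured with a small sketch. It assumes that a "phantom dependency" means a dependency declared in the spec that no import statement in the module's source backs up; the PR may define it differently. The names `collect_imports` and `phantom_dependencies` are hypothetical and are not taken from factory/spec/validate.py.

```python
import ast


def collect_imports(source: str) -> set[str]:
    """Return the top-level package names imported by a Python source string."""
    imported: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            imported.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped in this sketch.
            imported.add(node.module.split(".")[0])
    return imported


def phantom_dependencies(declared: set[str], source: str) -> set[str]:
    """Declared dependencies that no import statement in `source` supports."""
    return declared - collect_imports(source)
```

Because `ast.parse` only reads the source, the check does not execute or import the module under inspection.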

Test plan

  • 79 tests pass (parser: 29, validation: 22, existing spec generate: 28)
  • ruff check passes on all changed files
  • mypy passes on all changed files
  • Path existence check catches missing module paths
  • Import cross-ref detects phantom dependencies via ast.parse
  • Orphan detection flags modules with zero consumers
  • Hub detection flags modules with ≥5 dependents
  • Coupling metrics compute correct Ca/Ce/instability values
  • W₉ workflow graph validates with renamed node
  • Full test suite passes (2721 passed, 12 skipped)
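
As a rough illustration of the orphan, hub, and coupling checks in the test plan, the sketch below computes afferent coupling (Ca), efferent coupling (Ce), and instability I = Ce / (Ca + Ce) from a list of dependency edges. The function name `coupling_metrics`, the edge representation, and the choice to report 0.0 for isolated modules are assumptions; the real validator may, for example, exclude entry points from the orphan list.

```python
from collections import defaultdict


def coupling_metrics(modules, edges, hub_threshold=5):
    """Compute Ca, Ce, instability, orphans and hubs from (source, target) edges.

    An edge (a, b) means module `a` depends on module `b`.
    """
    ca = defaultdict(int)  # afferent: how many modules depend on this one
    ce = defaultdict(int)  # efferent: how many modules this one depends on
    for src, dst in set(edges):
        ce[src] += 1
        ca[dst] += 1

    metrics = {}
    for name in modules:
        total = ca[name] + ce[name]
        # Instability I = Ce / (Ca + Ce); reported as 0.0 for isolated modules here.
        instability = ce[name] / total if total else 0.0
        metrics[name] = {"ca": ca[name], "ce": ce[name], "instability": instability}

    orphans = sorted(m for m in modules if ca[m] == 0)  # zero consumers
    hubs = sorted(m for m in modules if ca[m] >= hub_threshold)  # >= 5 dependents
    return metrics, orphans, hubs
```

A module with five dependents and one dependency would be flagged as a hub with instability 1/6, which marks it as stable and therefore costly to change.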

xukai92 commented Jun 25, 2026

Collaborator

are you implementing this using remote-factory itself? why isn't it updating factory/discovery/spec.py?

xukai92 commented Jun 25, 2026

Collaborator

maybe because you are running it from an old commit. the spec.py was just merged this morning

@mihirathale98

Collaborator Author

> maybe because you are running it from an old commit. the spec.py was just merged this morning

oh yeah, I did not pull it, I guess

xukai92 commented Jun 25, 2026

Collaborator

i think your version is better than what's in main. 2 things i'd like to preserve to have it merged

  1. make it respect the symphony style (https://github.com/openai/symphony/blob/main/SPEC.md#normative-language) because this is what we have been assuming
  2. make sure it works with the existing workflow (see how it's used in main)

mihirathale98 changed the title from "feat: W₉ Spec Generate workflow + spec module scaffold" to "feat: spec validation engine + parser + CLI" Jun 25, 2026

@mihirathale98

Collaborator Author

> i think your version is better than what's in main. 2 things i'd like to preserve to have it merged
>
>   1. make it respect the symphony style (https://github.com/openai/symphony/blob/main/SPEC.md#normative-language) because this is what we have been assuming
>   2. make sure it works with the existing workflow (see how it's used in main)

  1. Yes, I am working on it. I looked at the symphony spec; I will take most of the design from it and add some of the graph-based ideas I have in mind. I just launched it with an initial draft to see what it builds.
  2. I will rebase and use it the way it is used in the current workflow.

Scaffold the spec module with source file collection (respecting
.gitignore, excluding node_modules/.factory/__pycache__), file batching
for Haiku context window, and async generate_spec() entry point.
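
The file batching mentioned above might look roughly like the following greedy grouping. This is a sketch under assumptions: the function name `batch_files` is hypothetical, the four-characters-per-token estimate is a crude stand-in for a real tokenizer, and the 80,000 default simply mirrors the batch token limit mentioned in a later commit.

```python
def batch_files(files, max_tokens=80_000, chars_per_token=4):
    """Greedily group (path, text) pairs so each batch stays under a token budget."""
    batches, current, used = [], [], 0
    for path, text in files:
        cost = len(text) // chars_per_token + 1  # crude token estimate
        if current and used + cost > max_tokens:
            # The next file would overflow the budget, so close the batch.
            batches.append(current)
            current, used = [], 0
        current.append(path)
        used += cost
    if current:
        batches.append(current)
    return batches
```

A single file larger than the budget still gets a batch of its own in this sketch; a real pipeline would need to truncate or split such files.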

Add spec_extractor.md (Haiku extraction prompt) and spec_annotator.md
(Researcher annotation prompt) for the two-stage pipeline.

Define spec_generate_workflow() in definitions.py with the
extract → gate → annotate → gate → validate_stub → gate pipeline.
Register as 'spec-generate' in register_all().

Add 'factory spec generate <path>' CLI subcommand with argparse
sub-subparser and handler registration.

Add comprehensive tests for source file collection, file batching,
workflow graph validation, and node structure.

Narrow getattr result from Any | None to str via str() cast, and
align fallback lambda parameter name with cmd_spec_generate signature.

Add factory/spec/parser.py with RepoSpec model and parse_spec()
for regex-based Markdown parsing of .factory/repo_spec.md.

Add factory/spec/validate.py with validate_spec() that runs path
existence checks, Python import cross-referencing via ast.parse,
orphan/hub module detection, and afferent/efferent coupling
metrics with instability computation. Writes results to
.factory/spec_validation.md.

Replace the placeholder echo PASS FnNode with a real validate
node that runs 'factory spec validate {project_path}', reading
repo_spec.md and writing spec_validation.md. Update existing
tests for the renamed node.

Wire cmd_spec_validate handler into the spec subparser and
handler dict. Re-export parse_spec and validate_spec from the
spec module __init__.

Test Markdown parsing (modules, edges, contracts, entry points,
change impact), path existence checks, Python import cross-ref,
orphan/hub detection, coupling metrics, and validation report
output.

Add factory/spec/update.py with DiffScope model, scope_diff() for
mapping diffs to affected spec modules, and update_spec() async
orchestration entry point. Add spec_patcher.md agent prompt for
incremental Haiku-based spec patching. Update __init__.py re-exports.
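
Mapping a diff to affected spec modules, as described above, could be done with a longest-prefix match on paths. The sketch below is illustrative only: `map_files_to_modules` and the `module_paths` dictionary are hypothetical and do not reflect the actual DiffScope model or the scope_diff() signature.

```python
from pathlib import PurePosixPath


def map_files_to_modules(changed_files, module_paths):
    """Map each changed file to the spec module whose path is its longest prefix.

    `module_paths` maps module name -> directory or file path from the spec.
    Files that match no module are collected under the key None.
    """
    scope = {}
    for changed in changed_files:
        parts = PurePosixPath(changed).parts
        best, best_len = None, 0
        for name, path in module_paths.items():
            prefix = PurePosixPath(path).parts
            # Compare whole path components so "factory/spec" does not
            # accidentally match "factory/spec_old/x.py".
            if parts[: len(prefix)] == prefix and len(prefix) > best_len:
                best, best_len = name, len(prefix)
        scope.setdefault(best, []).append(changed)
    return scope
```

Matching on path components rather than raw string prefixes keeps sibling directories with similar names apart.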
Add spec_update_workflow() defining W₁₀: diff_scope → patch →
gate_patch → revalidate → gate_revalidate. Add non-blocking
spec_update FnNode to improve and research workflows after
archivist, conditionally running if repo_spec.md exists.

Add cmd_spec_scope and cmd_spec_update handlers. Wire scope and
update as sub-subcommands under the spec subparser with event
emission for observability.

Add tests/test_spec_update.py with 34 tests covering diff parsing,
file-to-module mapping, scope formatting, scope_diff integration,
W₁₀ graph validation, registry count, and improve workflow
spec_update node integration. Update existing test assertions for
10-workflow registry count.

Add get_impact() to extract the subgraph centered on a named module
from the repo spec, returning a compact Markdown snippet for agent
context inclusion. Wire as 'factory spec impact <module> --project <path>'.
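
One way to picture the subgraph extraction above is a function that collects a module's direct dependencies and dependents and renders them as Markdown. This is a minimal sketch, not the PR's get_impact(): the name `impact_snippet`, the edge tuples, and the output layout are all assumptions, and the real function may follow edges more than one hop.

```python
def impact_snippet(module, edges):
    """Render a module's direct dependencies and dependents as Markdown.

    An edge (a, b) means `a` depends on `b`.
    """
    depends_on = sorted({dst for src, dst in edges if src == module})
    used_by = sorted({src for src, dst in edges if dst == module})
    lines = [f"### Impact: {module}", ""]
    lines.append("Depends on: " + (", ".join(depends_on) or "(none)"))
    lines.append("Used by: " + (", ".join(used_by) or "(none)"))
    return "\n".join(lines)
```

Keeping the snippet to one hop keeps it small enough to paste into an agent prompt.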
Add conditional 'Repo Spec (if available)' sections to strategist,
builder, qa, and ceo prompts so agents consult the spec for dependency
analysis, blast radius, and change impact during their workflows.

The test asserted 8 workflows but register_all() now returns 10
after spec-generate and spec-update were added.

Rename all file path references from repo_spec.md to GRAPH-SPEC.md
across Python source, agent prompts, workflow definitions, and tests.
Variable and function names (parse_spec, validate_spec, etc.) are
unchanged; only the on-disk filename changes.

Modify resolve_spec() to look for GRAPH-SPEC.md (committed at root,
then generated in .factory/). Replace procedural generate_spec() body
with delegation to factory.spec.generate.generate_spec(). Remove old
helpers (_read_readme_summary, _detect_source_dirs, _read_top_level_deps,
_fetch_github_issues); the agent pipeline handles extraction directly.
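
The lookup order described for resolve_spec() (committed copy at the repository root first, then the generated copy under .factory/) can be sketched as below. The function name `find_graph_spec` and its return type are assumptions for illustration; the real resolve_spec() may return something richer than a path.

```python
from pathlib import Path


def find_graph_spec(project_root):
    """Return the first existing GRAPH-SPEC.md, or None if neither location has one."""
    root = Path(project_root)
    candidates = [
        root / "GRAPH-SPEC.md",               # committed at the repository root
        root / ".factory" / "GRAPH-SPEC.md",  # generated copy
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Checking the committed file first lets a hand-reviewed spec take precedence over a freshly generated one.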
Rename section heading from SPEC.md to GRAPH-SPEC. Update status
messages to reference GRAPH-SPEC.md paths and the new CLI command
'factory spec generate' for absent specs.

Replace SPEC.md Diff with GRAPH-SPEC Diff throughout. Update diff
format from section-level entries (Section 2.3) to module-level
entries (module 'store'). Remove redundant Repo Spec section heading
in favor of GRAPH-SPEC.

Opus provides stronger architectural reasoning needed for RFC-style
spec generation with structural graph sections, normative language
precision, and coupling metric computation. Increase batch token
limit from 80K to 160K to leverage Opus's larger context window.

Replace Python ast.parse and regex-based import cross-referencing
with a Haiku agent call for language-agnostic verification. Keep
tier 1 structural checks (path existence, orphan/hub detection,
coupling metrics) as pure Python. Update tests to mock the Haiku
invoke_agent call.

Add ProjectIdentity, goals, abstraction levels, and coupling metrics
to the parser. Update agent prompts for RFC-style output. Fix regex
quantifiers in _extract_section and _parse_table_rows that prevented
parsing numbered section headers (e.g. "## 5. Change Impact").
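
To show what tolerating numbered section headers can involve, here is a hedged sketch of a section extractor and a table-row parser. It is not the code from factory/spec/parser.py: the public names `extract_section` and `parse_table_rows`, the regex, and the sample table are assumptions chosen to illustrate an optional "N." prefix before the heading title.

```python
import re


def extract_section(markdown, title):
    """Return the body under a '## <title>' heading, allowing an 'N.' prefix."""
    pattern = re.compile(
        rf"^##\s+(?:\d+\.\s+)?{re.escape(title)}[ \t]*$\n?(.*?)(?=^##\s|\Z)",
        re.MULTILINE | re.DOTALL,
    )
    match = pattern.search(markdown)
    return match.group(1).strip() if match else None


def parse_table_rows(section):
    """Split Markdown pipe-table lines into cell lists, skipping the separator row."""
    rows = []
    for line in section.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # the |---|---| separator row
        rows.append(cells)
    return rows
```

The optional group `(?:\d+\.\s+)?` is what lets both "## Change Impact" and "## 5. Change Impact" match the same title.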
Update builder, QA, and CEO prompt headings from
"## Repo Spec (if available)" to "## GRAPH-SPEC (if available)".

mihirathale98 force-pushed the factory/run-0743ea86 branch from f2b73ae to 007dc7d on June 26, 2026 19:06

github-actions Bot commented Jun 26, 2026

Sentrux Quality Report

Absolute

Scanning ....
[scan] git ls-files: 325 total, 313 kept, 12 dropped (ext:12, meta:0, big:0)
[build_project_map] 313 files, 54 unique dirs, 50 cache misses, 2.3ms
[resolve] 461 resolved, 797 unresolved (of 1258 total specs)
[resolve_imports] project_map 2.4ms, suffix_idx 0.6ms, suffix_resolve 9.6ms, total 12.6ms
[build_graphs] 313 files | maps 1.1ms, imports 12.7ms, calls+inherit 3.6ms, total 17.4ms | 460 import, 4557 call, 0 inherit edges
sentrux check — 2 rules checked

Quality: 4551

✗ [Error] max_cc: 5 function(s) exceed max cyclomatic complexity of 30
    factory/cli.py:cmd_ceo (cc=79)
    factory/study.py:study_project_local (cc=43)
    factory/cli.py:_welcome_wizard (cc=39)
    factory/cli.py:cmd_run (cc=37)
    factory/workflow/validation.py:validate_workflow (cc=31)

✗ 1 violation(s) found

Diff (vs base branch)

Scanning ....
[scan] git ls-files: 325 total, 313 kept, 12 dropped (ext:12, meta:0, big:0)
[build_project_map] 313 files, 54 unique dirs, 50 cache misses, 2.5ms
[resolve] 461 resolved, 797 unresolved (of 1258 total specs)
[resolve_imports] project_map 2.5ms, suffix_idx 0.7ms, suffix_resolve 9.6ms, total 12.8ms
[build_graphs] 313 files | maps 1.1ms, imports 12.9ms, calls+inherit 3.3ms, total 17.2ms | 460 import, 4557 call, 0 inherit edges
sentrux gate — structural regression check

Quality:      4702 -> 4551
Coupling:     0.75 → 0.75
Cycles:       4 → 4
God files:    0 → 0

Distance from Main Sequence: 0.35

✓ No degradation detected

The spec format was renamed to GRAPH-SPEC but introspect.py still
checked for the old SPEC.md filename, causing has_spec to always
return false.

Replace per-file _is_gitignored subprocess calls with a single
_get_gitignored batch using git check-ignore --stdin.

The discover CLI tests were calling generate_spec which spawns a
real claude agent. CI has no claude binary, so mock the call to
return stub GRAPH-SPEC content.

codecov Bot commented Jun 26, 2026

Codecov Report

❌ Patch coverage is 89.89501% with 77 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.56%. Comparing base (527199d) to head (e860ca7).
⚠️ Report is 11 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| factory/spec/generate.py | 60.67% | 35 Missing ⚠️ |
| factory/spec/update.py | 82.01% | 25 Missing ⚠️ |
| factory/spec/validate.py | 94.61% | 9 Missing ⚠️ |
| factory/spec/parser.py | 97.77% | 5 Missing ⚠️ |
| factory/study.py | 92.00% | 2 Missing ⚠️ |
| factory/spec/impact.py | 98.71% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #786      +/-   ##
==========================================
- Coverage   87.11%   86.56%   -0.55%     
==========================================
  Files          81       87       +6     
  Lines       12245    12891     +646     
==========================================
+ Hits        10667    11159     +492     
- Misses       1578     1732     +154     

@mihirathale98

Collaborator Author

✅ Factory Review: KEEP

Verdict: KEEP
Reason:


Posted by Factory CEO
