feat(core): repair malformed llm grader output #933

Open: christso wants to merge 4 commits into main from feat/911-smart-llm-grader-retry

@christso (Collaborator) commented Apr 4, 2026

Closes #911

Summary

  • add a final structure-repair retry for llm-grader after the 3 standard attempts fail on malformed structured output
  • reuse the last invalid grader response plus validation error instead of re-grading from scratch
  • skip the repair path when the grader returned no content to salvage
  • document the AgentV OSS board claim workflow fix in AGENTS.md so missing project items are added before status updates
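The retry flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `parseGraderOutput`, `buildRepairPrompt`, `gradeWithRepair`, the `CallGrader` type, the `score`/`reasoning` result shape, and the prompt wording are all hypothetical names and assumptions chosen for the example.

```typescript
// Hypothetical types and names; none of these are taken from the AgentV codebase.
type GraderResult = { score: number; reasoning: string };
type ParseOutcome =
  | { kind: "ok"; value: GraderResult }
  | { kind: "error"; error: string };

// Validate raw grader text against the (assumed) expected structure.
function parseGraderOutput(raw: string): ParseOutcome {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (e) {
    return { kind: "error", error: `invalid JSON: ${(e as Error).message}` };
  }
  if (typeof data !== "object" || data === null) {
    return { kind: "error", error: "expected a JSON object" };
  }
  const obj = data as Record<string, unknown>;
  if (typeof obj.score !== "number") {
    return { kind: "error", error: "score must be a number" };
  }
  if (typeof obj.reasoning !== "string") {
    return { kind: "error", error: "reasoning must be a string" };
  }
  return { kind: "ok", value: { score: obj.score, reasoning: obj.reasoning } };
}

// Reuse the last invalid response plus its validation error instead of re-grading.
function buildRepairPrompt(lastResponse: string, validationError: string): string {
  return [
    "Your previous response could not be parsed.",
    `Validation error: ${validationError}`,
    "Rewrite the response below as valid JSON with the fields 'score' and 'reasoning'.",
    "Keep the original judgement; change only the structure.",
    "--- previous response ---",
    lastResponse,
  ].join("\n");
}

// Stand-in for whatever function actually calls the grader model.
type CallGrader = (prompt: string) => Promise<string>;

async function gradeWithRepair(
  callGrader: CallGrader,
  gradingPrompt: string,
  maxAttempts = 3,
): Promise<GraderResult> {
  let lastResponse = "";
  let lastError = "no response";

  // Standard attempts: re-grade from scratch each time.
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    lastResponse = await callGrader(gradingPrompt);
    const parsed = parseGraderOutput(lastResponse);
    if (parsed.kind === "ok") return parsed.value;
    lastError = parsed.error;
  }

  // Skip the repair path when there is no content to salvage.
  if (lastResponse.trim() === "") {
    throw new Error(`Grader parse failure after ${maxAttempts} attempts`);
  }

  // One final structure-repair attempt based on the last invalid response.
  const repaired = parseGraderOutput(
    await callGrader(buildRepairPrompt(lastResponse, lastError)),
  );
  if (repaired.kind === "ok") return repaired.value;
  throw new Error(
    `Grader parse failure after ${maxAttempts} attempts and one repair attempt: ${repaired.error}`,
  );
}
```

The point of the design is that the repair call asks only for a structural fix, so a grader that already produced a usable judgement in the wrong shape does not have to redo the grading.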

Verification

  • bun test packages/core/test/evaluation/evaluators.test.ts packages/core/test/evaluation/evaluators_variables.test.ts packages/core/test/evaluation/orchestrator.test.ts
  • pre-push hook passed: build, typecheck, lint, test, validate eval YAML files

Red/Green UAT

Red on main:

  • bun apps/cli/src/cli.ts eval /tmp/agentv-911-redgreen/repair.eval.yaml --target candidate_mock --output /tmp/agentv-911-redgreen/main.red.jsonl
  • result: repair-check was skipped with "Grader parse failure after 3 attempts", and /tmp/agentv-911-redgreen/main.red.jsonl recorded execution_status: execution_error with score: 0

Green on this branch:

  • bun apps/cli/dist/cli.js eval /tmp/agentv-911-redgreen/repair.eval.yaml --target candidate_mock --output /tmp/agentv-911-redgreen/branch.green.jsonl
  • result: the same eval passed with score: 1 after the grader script received the structure-repair prompt, and /tmp/agentv-911-redgreen/branch.green.jsonl recorded execution_status: ok
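To compare the red and green outputs without reading the JSONL by hand, a small check along these lines may help. It is a sketch that assumes each line of the output file is a JSON object with the `execution_status` and `score` fields mentioned above; `ResultRecord`, `parseResults`, and `allPassed` are hypothetical helpers, and any other fields in the real records are ignored.

```typescript
// Assumed record shape: only the two fields named in the PR description are read.
type ResultRecord = { execution_status?: string; score?: number };

// Parse JSONL text (one JSON object per non-empty line).
function parseResults(jsonl: string): ResultRecord[] {
  return jsonl
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as ResultRecord);
}

// True only when there is at least one record and every record ran OK with score 1.
function allPassed(records: ResultRecord[]): boolean {
  return (
    records.length > 0 &&
    records.every((r) => r.execution_status === "ok" && r.score === 1)
  );
}

// Illustrative lines only, built from the status and score values reported above.
const redSample = '{"execution_status":"execution_error","score":0}\n';
const greenSample = '{"execution_status":"ok","score":1}\n';

console.log(allPassed(parseResults(redSample)));
console.log(allPassed(parseResults(greenSample)));
```

In practice the file contents would be read from the paths passed to --output and fed to `parseResults` in place of the sample strings.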

@cloudflare-workers-and-pages

Deploying agentv with Cloudflare Pages

Latest commit: 321f3e7
Status: ✅  Deploy successful!
Preview URL: https://a95c6d91.agentv.pages.dev
Branch Preview URL: https://feat-911-smart-llm-grader-re.agentv.pages.dev

Development

Successfully merging this pull request may close these issues.

feat: smart LLM grader retry — reuse content and ask LLM to fix structure