
fix: benchmark fixtures unusable — wrong format and missing is_attack labels (issue #48) #49

Merged
ksek87 merged 3 commits into main from claude/roadmap-ticket-planning-GEbPu
May 25, 2026


@ksek87 ksek87 commented May 25, 2026

Closes #48

Summary

  • Converts all three bench/ fixture files from wrapped objects {"_meta":…,"tools":[…]} to flat JSON arrays as fuzzd benchmark expects, one tool per line
  • Adds "is_attack": true to each tool in mcptox_representative.json and mcptox_actual.json (these are all attack samples)
  • Adds "is_attack": false to each tool in clean_tools.json
  • Adds five regression tests in main.rs that parse the fixtures at test time and assert: representative and actual fixtures have attack labels, clean has none, recall is 1.0 against the representative set, and combined precision stays ≥ 0.90

Why it was never caught

Two silent failure modes stacked:

  1. The benchmark command expects a flat array but fixtures were wrapped objects — a parse error at runtime, but no test ever ran fuzzd benchmark --schema bench/*.json
  2. LabelledTool.meta.is_attack has #[serde(default)] so a missing field silently becomes false — no error, just wrong numbers
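
The second failure mode can be modelled without any dependencies. The sketch below is not fuzzd code: `Sample`, `Counts`, `count_lenient`, and `count_strict` are hypothetical names, and `Option<bool>` stands in for a fixture field that may be absent. It shows how falling back to `false` turns every detection into a false positive, whereas a strict reader would fail loudly instead.

```rust
// Illustrative model only: all names here are hypothetical, not fuzzd's.

/// One benchmark entry: whether the scanner flagged it, and its label if the
/// fixture supplied one (`None` models a fixture with no `is_attack` field).
#[derive(Debug, Clone, Copy)]
pub struct Sample {
    pub flagged: bool,
    pub is_attack: Option<bool>,
}

/// Confusion-matrix counts (`fn_` because `fn` is a keyword).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Counts {
    pub tp: usize,
    pub fp: usize,
    pub tn: usize,
    pub fn_: usize,
}

/// Lenient counting: a missing label falls back to `false`, which is the
/// effect a defaulted boolean field has.
pub fn count_lenient(samples: &[Sample]) -> Counts {
    let mut counts = Counts::default();
    for s in samples {
        tally(&mut counts, s.flagged, s.is_attack.unwrap_or_default());
    }
    counts
}

/// Strict counting: a missing label is an error naming the offending index.
pub fn count_strict(samples: &[Sample]) -> Result<Counts, String> {
    let mut counts = Counts::default();
    for (i, s) in samples.iter().enumerate() {
        let label = s
            .is_attack
            .ok_or_else(|| format!("sample {} has no is_attack label", i))?;
        tally(&mut counts, s.flagged, label);
    }
    Ok(counts)
}

fn tally(counts: &mut Counts, flagged: bool, is_attack: bool) {
    match (flagged, is_attack) {
        (true, true) => counts.tp += 1,
        (true, false) => counts.fp += 1,
        (false, false) => counts.tn += 1,
        (false, true) => counts.fn_ += 1,
    }
}
```

With three flagged but unlabelled samples, `count_lenient` reports three false positives and no true positives, while `count_strict` returns an error. The regression tests in this PR take the other route: they keep the default and assert on the labels in the fixtures themselves.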

Test plan

  • fuzzd benchmark --schema bench/mcptox_representative.json → Precision: 1.000, Recall: 1.000, F1: 1.000
  • fuzzd benchmark --schema bench/clean_tools.json → 20 true negatives, 0 false positives
  • cargo test benchmark_fixture_tests → 5 tests pass
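
For reference, the figures in the test plan follow from the standard definitions of precision, recall, and F1. The sketch below assumes nothing about fuzzd's internals: `Confusion` and `ratio` are hypothetical names, and only the formulas are standard.

```rust
// Hypothetical helper names; the formulas are the standard definitions.

/// Confusion-matrix counts for one benchmark run (`fn_` because `fn` is a keyword).
#[derive(Debug, Clone, Copy)]
pub struct Confusion {
    pub tp: u32,
    pub fp: u32,
    pub fn_: u32,
}

/// Returns `n / d`, or `None` when the denominator is zero and the metric
/// therefore has no defined value.
pub fn ratio(n: u32, d: u32) -> Option<f64> {
    if d == 0 {
        None
    } else {
        Some(f64::from(n) / f64::from(d))
    }
}

impl Confusion {
    /// Precision = TP / (TP + FP)
    pub fn precision(&self) -> Option<f64> {
        ratio(self.tp, self.tp + self.fp)
    }

    /// Recall = TP / (TP + FN)
    pub fn recall(&self) -> Option<f64> {
        ratio(self.tp, self.tp + self.fn_)
    }

    /// F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall.
    pub fn f1(&self) -> Option<f64> {
        ratio(2 * self.tp, 2 * self.tp + self.fp + self.fn_)
    }
}
```

A run in which every attack sample is flagged and nothing else is (`fp == 0`, `fn_ == 0`, `tp > 0`) gives 1.0 for all three metrics. A clean-only run has no positives at all, so precision and recall have no defined value there, which is consistent with the second test-plan item quoting counts rather than ratios. How fuzzd itself prints that case is not stated in this PR.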

https://claude.ai/code/session_01G4f8mN9SeSHSGY1dWfFzih

… labels (issue #48)

Fixtures were wrapped objects {"_meta":…,"tools":[…]} and lacked is_attack
labels on each tool, making `fuzzd benchmark --schema bench/*.json` either
crash with a parse error or silently report garbage results (all detections
as false positives because is_attack defaulted to false).

- Convert all three bench fixtures to flat JSON arrays
- Add is_attack:true to each tool in mcptox_representative/actual (attack corpus)
- Add is_attack:false to each tool in clean_tools (benign corpus)
- Add three regression tests in main.rs that parse the fixture files directly
  and assert: representative has attack labels, clean has none, and recall ≥ 1.0
  against the representative set — so this format gap cannot regress silently

https://claude.ai/code/session_01G4f8mN9SeSHSGY1dWfFzih
@ksek87 ksek87 force-pushed the claude/roadmap-ticket-planning-GEbPu branch from 64d3d06 to 649932e May 25, 2026 18:42
claude added 2 commits May 25, 2026 18:50
…ecision bound

- actual_fixture_parses_and_has_attack_labels: verifies mcptox_actual.json
  (485 tools) parses correctly and every entry has is_attack=true; previously
  the largest fixture had zero test coverage
- combined_benchmark_precision_within_bounds: runs the full attack+clean
  benchmark and asserts precision >= 0.90, locking in the current FP count
  so a regression that adds new false positives is caught immediately

https://claude.ai/code/session_01G4f8mN9SeSHSGY1dWfFzih
…(v0.9 done)

- README.md: mark v0.8 and v0.9 as Done in roadmap; remove their now-stale
  milestone detail sections; update signal table from 13 to 21 entries adding
  the 8 new v0.9 signals; update architecture diagram to 21 variants / 155 AC
  patterns; note inputSchema scanning in the scanner description
- bench/README.md: update signal distribution header to 21 signals / 155 patterns;
  replace stale coverage gap notes for #34/#35 with a short done-status note;
  add the 8 new signals to the signal table; fix the "Adding to the benchmark"
  _meta example to use the new is_attack:true format instead of the old taxonomy fields

https://claude.ai/code/session_01G4f8mN9SeSHSGY1dWfFzih
@ksek87 ksek87 merged commit 6ec3baa into main May 25, 2026
8 checks passed
Development

Successfully merging this pull request may close these issues.

fix: benchmark fixtures unusable out of the box — wrong format + missing is_attack labels

2 participants