fix: benchmark fixtures unusable - wrong format and missing is_attack labels (issue #48) #49
Merged
… labels (issue #48)

Fixtures were wrapped objects `{"_meta":…,"tools":[…]}` and lacked is_attack labels on each tool, so `fuzzd benchmark --schema bench/*.json` either crashed with a parse error or silently reported garbage results (every detection counted as a false positive because is_attack defaulted to false).

- Convert all three bench fixtures to flat JSON arrays
- Add is_attack:true to each tool in mcptox_representative/actual (attack corpus)
- Add is_attack:false to each tool in clean_tools (benign corpus)
- Add three regression tests in main.rs that parse the fixture files directly and assert that representative has attack labels, clean has none, and recall ≥ 1.0 against the representative set, so this format gap cannot regress silently

https://claude.ai/code/session_01G4f8mN9SeSHSGY1dWfFzih
64d3d06 to 649932e
…ecision bound

- actual_fixture_parses_and_has_attack_labels: verifies that mcptox_actual.json (485 tools) parses correctly and that every entry has is_attack=true; previously the largest fixture had zero test coverage
- combined_benchmark_precision_within_bounds: runs the full attack+clean benchmark and asserts precision >= 0.90, locking in the current FP count so that a regression adding new false positives is caught immediately

https://claude.ai/code/session_01G4f8mN9SeSHSGY1dWfFzih
…(v0.9 done)

- README.md: mark v0.8 and v0.9 as Done in the roadmap; remove their now-stale milestone detail sections; update the signal table from 13 to 21 entries, adding the 8 new v0.9 signals; update the architecture diagram to 21 variants / 155 AC patterns; note inputSchema scanning in the scanner description
- bench/README.md: update the signal distribution header to 21 signals / 155 patterns; replace the stale coverage gap notes for #34/#35 with a short done-status note; add the 8 new signals to the signal table; fix the "Adding to the benchmark" _meta example to use the new is_attack:true format instead of the old taxonomy fields

https://claude.ai/code/session_01G4f8mN9SeSHSGY1dWfFzih
Closes #48
Summary
- Convert all three `bench/` fixture files from wrapped objects `{"_meta":…,"tools":[…]}` to flat JSON arrays, as `fuzzd benchmark` expects, one tool per line
- Add `"is_attack": true` to each tool in `mcptox_representative.json` and `mcptox_actual.json` (these are all attack samples)
- Add `"is_attack": false` to each tool in `clean_tools.json`
- Add regression tests in `main.rs` that parse the fixtures at test time and assert that the representative and actual fixtures have attack labels, clean has none, recall is 1.0 against the representative set, and combined precision stays ≥ 0.90

Why it was never caught
Two silent failure modes stacked:
- `fuzzd benchmark --schema bench/*.json`
- `LabelledTool.meta.is_attack` has `#[serde(default)]`, so a missing field silently becomes `false`: no error, just wrong numbers

Test plan
- `fuzzd benchmark --schema bench/mcptox_representative.json` → Precision: 1.000, Recall: 1.000, F1: 1.000
- `fuzzd benchmark --schema bench/clean_tools.json` → 20 true negatives, 0 false positives
- `cargo test benchmark_fixture_tests` → 5 tests pass

https://claude.ai/code/session_01G4f8mN9SeSHSGY1dWfFzih
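For reviewers who want to see how the `#[serde(default)]` failure mode under "Why it was never caught" skews the numbers, here is a minimal std-only sketch. It is not fuzzd's code: `Labelled`, `Counts`, `score_lenient`, `score_strict`, and `unlabelled_attacks` are hypothetical names, and `Option<bool>` stands in for a label that may be absent from the fixture. The lenient path assumes the same fallback a defaulted `bool` field gets (missing becomes `false`); the strict path treats a missing label as an error.

```rust
// Hypothetical stand-in for a labelled benchmark entry; not fuzzd's real type.
#[derive(Debug, Clone)]
struct Labelled {
    name: &'static str,
    is_attack: Option<bool>, // None models a fixture entry with no is_attack field
    flagged: bool,           // whether the scanner raised a detection for this tool
}

// Confusion-matrix counts for one benchmark run.
#[derive(Debug, Default, PartialEq)]
struct Counts {
    tp: u32,
    fp: u32,
    tn: u32,
    fn_: u32,
}

fn tally(c: &mut Counts, is_attack: bool, flagged: bool) {
    match (is_attack, flagged) {
        (true, true) => c.tp += 1,
        (false, true) => c.fp += 1,
        (false, false) => c.tn += 1,
        (true, false) => c.fn_ += 1,
    }
}

// Lenient scoring: a missing label falls back to false, mirroring what a
// defaulted bool field would do. Nothing fails; the counts are simply wrong.
fn score_lenient(tools: &[Labelled]) -> Counts {
    let mut c = Counts::default();
    for t in tools {
        tally(&mut c, t.is_attack.unwrap_or(false), t.flagged);
    }
    c
}

// Strict scoring: a missing label is reported as an error instead of guessed.
fn score_strict(tools: &[Labelled]) -> Result<Counts, String> {
    let mut c = Counts::default();
    for t in tools {
        let label = t
            .is_attack
            .ok_or_else(|| format!("tool `{}` has no is_attack label", t.name))?;
        tally(&mut c, label, t.flagged);
    }
    Ok(c)
}

// Precision is undefined when nothing was flagged, hence the Option.
fn precision(c: &Counts) -> Option<f64> {
    let flagged = c.tp + c.fp;
    if flagged == 0 {
        None
    } else {
        Some(c.tp as f64 / flagged as f64)
    }
}

// Two attack samples that the scanner flags, but whose labels are missing.
fn unlabelled_attacks() -> Vec<Labelled> {
    vec![
        Labelled { name: "tool_a", is_attack: None, flagged: true },
        Labelled { name: "tool_b", is_attack: None, flagged: true },
    ]
}

fn main() {
    let tools = unlabelled_attacks();
    let lenient = score_lenient(&tools);
    println!("lenient: {:?}, precision = {:?}", lenient, precision(&lenient));
    println!("strict:  {:?}", score_strict(&tools));
}
```

With the lenient fallback, every correct detection on an unlabelled attack corpus is counted as a false positive, which matches the symptom described in issue #48. Whether fuzzd should make the field mandatory or keep the default and rely on the new fixture tests is a design choice this sketch does not settle.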