Add corpus support to local DotnetFuzzing runs and fuzz Deflate64 #121019
cc @MihaZupan
Pull Request Overview
This PR adds corpus support to the DotnetFuzzing project for local runs, allowing fuzzers to use seed corpora instead of relying on dictionaries. The change improves fuzzing efficiency by providing proper initial test cases. The ZipArchiveFuzzer is updated to use a corpus, and a new Deflate64Fuzzer is introduced to test the managed Deflate64 decompression implementation.
- Adds corpus infrastructure to the fuzzing framework with validation and deployment logic
- Converts ZipArchiveFuzzer to use corpus instead of dictionary for better fuzzing effectiveness
- Introduces Deflate64Fuzzer to test internal Deflate64 decompression used in ZipArchive reading
Reviewed Changes
Copilot reviewed 6 out of 13 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/libraries/Fuzzing/DotnetFuzzing/Program.cs | Adds corpus directory handling, validation, and deployment logic for both OneFuzz and local runs |
| src/libraries/Fuzzing/DotnetFuzzing/IFuzzer.cs | Extends IFuzzer interface with optional Corpus property |
| src/libraries/Fuzzing/DotnetFuzzing/Fuzzers/ZipArchiveFuzzer.cs | Adds corpus property to use seed files instead of dictionary |
| src/libraries/Fuzzing/DotnetFuzzing/Fuzzers/Deflate64Fuzzer.cs | New fuzzer for testing Deflate64 decompression with reflection-based stream creation |
| src/libraries/Fuzzing/DotnetFuzzing/DotnetFuzzing.csproj | Simplifies fuzzer file inclusion using wildcard and adds corpus files to build output |
| eng/pipelines/libraries/fuzzing/deploy-to-onefuzz.yml | Adds OneFuzz deployment task for new Deflate64Fuzzer |
TestArchive(CopyToRentedArray(bytes), bytes.Length, async: false).GetAwaiter().GetResult();
TestArchive(CopyToRentedArray(bytes), bytes.Length, async: true).GetAwaiter().GetResult();
Copilot AI, Oct 23, 2025
The fuzzer rents two separate arrays for synchronous and asynchronous test paths. Consider reusing a single rented array for both paths to reduce allocation overhead during fuzzing.
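A minimal sketch of the single-buffer idea, assuming the test routine does not take ownership of the array (in the PR, `TestArchive` may well return the rented array itself, in which case this does not apply as written). `Consume` and `RunBothModes` are hypothetical names used only for illustration; `ArrayPool<byte>.Shared` is the standard .NET pool.

```csharp
using System;
using System.Buffers;

public static class RentedBufferSketch
{
    // Hypothetical stand-in for the fuzzer's per-mode test routine.
    // It only reads the first `length` bytes and never returns the array to the pool.
    public static int Consume(byte[] buffer, int length, bool async)
    {
        int sum = 0;
        for (int i = 0; i < length; i++)
        {
            sum += buffer[i];
        }
        return sum;
    }

    public static (int Sync, int Async) RunBothModes(ReadOnlySpan<byte> bytes)
    {
        // Rent once; the pool may hand back an array longer than requested,
        // so the real length is always passed alongside the buffer.
        byte[] rented = ArrayPool<byte>.Shared.Rent(bytes.Length);
        try
        {
            bytes.CopyTo(rented);
            int syncResult = Consume(rented, bytes.Length, async: false);

            // Re-copy the input in case the first pass modified the buffer.
            bytes.CopyTo(rented);
            int asyncResult = Consume(rented, bytes.Length, async: true);

            return (syncResult, asyncResult);
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

Renting from the pool is already cheap, so the saving here is mostly one rent/return pair per input; whether that is worth the extra ownership rules is a judgment call.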
Tagging subscribers to this area: @dotnet/area-meta
string script = $"%~dp0/libfuzzer-dotnet.exe --target_path=%~dp0/DotnetFuzzing.exe --target_arg={fuzzer.Name}";

if (fuzzer.Dictionary is not null)
// We don't support dictionaries and corpora at the same time yet, and some fuzzers
Nit: Any reason why not?
The corpus will be ignored by OneFuzz, but I don't see why we should block it locally.
This changes only the local run script; OneFuzz runs are unaffected.
Since we don't have corpora set up for OneFuzz (yet), some fuzzers place example inputs as dictionary entries instead, but that does not work as well as having them in the corpus. I didn't want to remove the dictionaries entirely because that might slow down OneFuzz runs, so as a temporary compromise, the corpus takes precedence over the dictionary when running locally (it is more efficient to omit the suboptimal dictionary in this case).
Of course, having both at the same time (a corpus of whole example inputs, plus a dictionary with the right "alphabet" from which to compose inputs) would be best, and is something we should aim for in the future.
If those dictionaries aren't adding any value beyond seeding the initial corpus for OneFuzz, I think they should be fine to delete now - OneFuzz will reuse the current (already seeded) corpus for new runs for us.
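The precedence rule discussed in this thread can be sketched roughly as follows. This is an illustration only: `SeedSource` and `ChooseForLocalRun` are hypothetical names, not part of the PR, and the two booleans stand in for the `fuzzer.Corpus is not null` and `fuzzer.Dictionary is not null` checks.

```csharp
using System;

// Hypothetical enum naming the seed source a local run ends up using.
public enum SeedSource
{
    None,
    Dictionary,
    Corpus
}

public static class SeedSelectionSketch
{
    // Local runs only: a corpus, when present, wins over a dictionary,
    // because the current dictionaries mostly hold whole example inputs.
    public static SeedSource ChooseForLocalRun(bool hasCorpus, bool hasDictionary)
    {
        if (hasCorpus)
        {
            return SeedSource.Corpus;
        }

        return hasDictionary ? SeedSource.Dictionary : SeedSource.None;
    }
}
```

Supporting both at once would replace the single return value with two independent decisions, which is the direction suggested above for future work.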
// use corpus as corpus if available as it is more effective that way.
if (fuzzer.Corpus is not null)
{
script += " %~dp0/corpus";
Please move this after the "additional arguments" so that it becomes the last option.
Going by the text in https://llvm.org/docs/LibFuzzer.html#options:
> To run the fuzzer, pass zero or more corpus directories as command line arguments. The fuzzer will read test inputs from each of these corpus directories, and any new test inputs that are generated will be written back to the first corpus directory
This way if you use a custom folder when fuzzing locally, you'll see the inputs being written there instead of in the deployment folder.
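A rough sketch of the suggested ordering, assuming the generated batch script forwards user-supplied arguments with `%*` (that forwarding token is an assumption, as is the `BuildCommand` helper name). Because the bundled corpus directory is appended last, any directory the user passes comes first and, per the libFuzzer text quoted above, receives the newly generated inputs.

```csharp
using System;

public static class LocalScriptSketch
{
    // Hypothetical helper; builds the command line for the local run script.
    public static string BuildCommand(string fuzzerName, bool hasCorpus)
    {
        string script = $"%~dp0/libfuzzer-dotnet.exe --target_path=%~dp0/DotnetFuzzing.exe --target_arg={fuzzerName}";

        // Assumption: %* forwards whatever the user typed after the script name,
        // including an optional custom corpus directory.
        script += " %*";

        // Append the deployed seed corpus last so it is never the first
        // (that is, the writable) corpus directory when a custom one is given.
        if (hasCorpus)
        {
            script += " %~dp0/corpus";
        }

        return script;
    }
}
```

When no custom directory is passed, the bundled corpus is the only directory and new inputs are still written into the deployment folder, as before.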
This PR introduces corpus support for the DotnetFuzzing project. For now it works for local runs only: OneFuzz requires a special container for the corpus, which probably has to be created manually once per fuzzer, so that is left for future work.
Some fuzzers currently (IMO incorrectly) use dictionaries as a substitute for the missing corpus support, which makes fuzzing inefficient. This PR prepares the ground for future improvements in this regard.
To validate the concept, this PR converts ZipArchive fuzzing to use a corpus instead of a dictionary, and adds code to fuzz Deflate64 (for which we have a managed implementation that is used internally when reading ZipArchives).