Automatic signalling for LibAFL #5
Comments
I believe that the latter approach is better, because the library already provides a large set of sophisticated handling for different kinds of coverage. If we switch to explicit signalling, we will probably end up with limited, basic coverage reporting, which may not work as well as the existing mechanisms. In this sense, I think we can take advantage of SanitizerCoverage, which is already supported in
This should give us a version of the binary that is instrumented by both SymRustC and the sanitizer coverage calls. What remains is how to get the coverage report out of the binary. One inevitable thing we need to do is linking
The second and more important challenge is to get the coverage report (the edge map) from the execution and give it to the observer. Again, in their example, they simply use the statically allocated array in the library directly, since the target program is compiled into the fuzzer program and both live in the same memory space. In contrast, in our case the binary is a separate process, so the edge map will not be directly accessible. I guess the solution is to use the shared memory facilities. Inspired by the concolic observer in their example (see this and this), we need to create a shared memory region in the fuzzer program, and later in the custom SymCC runtime, overwrite the
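To make the edge-map idea above concrete, here is a minimal sketch, assuming that coverage is recorded as one hit counter per edge in a flat byte buffer. The `EdgeMap` type and its methods are hypothetical names introduced for illustration only; they are not part of LibAFL, SymCC or SanitizerCoverage. A plain byte buffer stands in for the shared memory region that the fuzzer process would create, so the sketch only illustrates that the recording side does not need to own the storage.

```rust
/// Hypothetical view over an edge map (illustration only).
/// In the setting discussed above, `counters` would be backed by shared
/// memory created by the fuzzer; here it can be any mutable byte buffer.
pub struct EdgeMap<'a> {
    counters: &'a mut [u8],
}

impl<'a> EdgeMap<'a> {
    /// Wrap an existing buffer. The buffer must not be empty.
    pub fn new(counters: &'a mut [u8]) -> Self {
        assert!(!counters.is_empty(), "edge map must have at least one slot");
        Self { counters }
    }

    /// Record one hit of the edge identified by `guard`, in the way a
    /// coverage callback could. Identifiers wrap around the map size and
    /// counters wrap around on overflow.
    pub fn record(&mut self, guard: u32) {
        let idx = guard as usize % self.counters.len();
        self.counters[idx] = self.counters[idx].wrapping_add(1);
    }

    /// Number of distinct slots that were hit at least once.
    pub fn covered(&self) -> usize {
        self.counters.iter().filter(|&&c| c != 0).count()
    }
}
```

Because the map only borrows its storage, the process that reads the buffer afterwards (the fuzzer side, under the assumption above) sees the counters without any copy.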
Thanks! After some experimentation, I can confirm that these options were indeed among the missing puzzle pieces. At least, it allowed me to instrument the necessary
The overall compilation architecture of our Rust examples to fuzz will indeed depend on that of LibAFL. If LibAFL had originally been built as a direct modification of the Rust compiler (or of the SymRustC compiler), then it would in principle have been possible to compile our examples in a standalone fashion (e.g. while imagining LibAFL embedding part of its own
Thinking about the input space of Rust programs that we were initially targeting here, it might be a bit ambitious to attempt automatic signalling for an arbitrary Rust binary. In the meantime, instead of a binary, we could start supporting Rust
Unfortunately, this will require slightly redesigning the Rust concolic examples that we provide as input. However, one might arguably notice that the input space of SymRustC is already restricted to the set of programs that SymCC supports. For instance, a SymRustC program can only be concolically executed when it follows the SymCC convention of using the precise
In this example, we deactivate all
Whereas the LibAFL loop and all dependencies are first compiled using this regular command: Line 407 in e7eae0a
we explicitly focus on the harness function, and compile it again so that it is sanitized with `libafl_targets`: Line 415 in e7eae0a
(Without loss of functionality, this is actually an over-approximation, as it is the full LibAFL loop that gets sanitized.) Regarding automation, a next step would be to see how we can embed the appropriate sanitizing information: Line 421 in e7eae0a
inside a respective libfuzzer_rust_concolic_instance/fuzzer/build.rs ...
I'm not sure about the internal dependency management of cargo, but maybe we can give the flags only to our harness library so
One can solve this problem by taking advantage of the incremental recompilation offered by cargo:
At the time of writing, we are using this trick to give the flags to the harness: Line 440 in 653042a
In particular, this does not work if we insert an additional Line 441 in 653042a
Hopefully, the timestamps and content of those rlib files are not tracked by cargo when it detects packages to be potentially recompiled. Consequently, the harness does not get recompiled here (if it were, it would be recompiled by default without the flags): Line 442 in 653042a
For the future, we can consider direct Rust instrumentation through
We are interested in using LibAFL together with SymRustC in a generic way, i.e. having a framework that takes an arbitrary Rust program as input and runs the whole simulation as automatically as possible.
At first sight, the following setting seems to solve the problem:
symrustc/Dockerfile
Line 463 in 33b425d
because here we are only specifying our Rust source `source_0_original_1c_rs` as input at a single location in the build phase.

However, for an arbitrary Rust program, this turns out to be unsatisfactory: during its main simulation loop, LibAFL apparently needs to know how far the Rust program has progressed while that program is executing. In LibAFL, this progress information can be implemented with either:
- `explicit signalling`, as in: https://github.com/sfu-rsl/LibAFL/blob/59bb8e61856b22047f8e6e2787a3f6d90ae99006/fuzzers/baby_fuzzer/src/main.rs#L39
- `libafl_targets`, as in: https://github.com/sfu-rsl/LibAFL/blob/59bb8e61856b22047f8e6e2787a3f6d90ae99006/fuzzers/libfuzzer_stb_image_concolic/fuzzer/build.rs#L38
In particular, whereas the above Rust source `source_0_original_1c_rs` is not duplicated elsewhere (thus satisfying our genericity constraint), that code currently uses neither `explicit signalling` nor `libafl_targets`. It then gets compiled by `libafl_solving_build.sh`:
symrustc/Dockerfile
Line 467 in 33b425d
and the resulting binary is dynamically called afterwards by LibAFL:
https://github.com/sfu-rsl/LibAFL/blob/59bb8e61856b22047f8e6e2787a3f6d90ae99006/fuzzers/libfuzzer_rust_concolic/fuzzer/src/main.rs#L189
https://github.com/sfu-rsl/LibAFL/blob/59bb8e61856b22047f8e6e2787a3f6d90ae99006/fuzzers/libfuzzer_rust_concolic/fuzzer/src/main.rs#L91
Note that, since it is instrumented by SymRustC, this binary may expect to be executed in a mode where the concolic run is disabled, as opposed to the situation where the same binary expects to be executed with the concolic run enabled:
https://github.com/sfu-rsl/LibAFL/blob/59bb8e61856b22047f8e6e2787a3f6d90ae99006/fuzzers/libfuzzer_rust_concolic/fuzzer/src/main.rs#L321
(The content of `target_symcc0.out` is exactly:
symrustc/src/rs/libafl_solving_bin.sh
Line 12 in 33b425d
In particular, it is internally calling `target_symcc.out`.)

To show that the `explicit signalling` solution can be straightforward to put in place (i.e. that it does not require significant knowledge of low-level Rust, C and LLVM programming), we provide another example called `source_0_original_1c0_rs`, where we manually insert several signalling calls near several `if then else` constructs of interest:
https://github.com/sfu-rsl/LibAFL/blob/59bb8e61856b22047f8e6e2787a3f6d90ae99006/fuzzers/libfuzzer_rust_concolic_instance/fuzzer/src/main.rs#L217
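The kind of manual signalling described above can be sketched as follows. This is a self-contained illustration that does not depend on LibAFL: the `SIGNALS` array, the `signals_set`, `signals_reset` and `signals_snapshot` helpers, and the `harness` body with its byte checks are all hypothetical and are not copied from `source_0_original_1c0_rs`. The sketch assumes one signal slot per branch of interest, and uses atomics instead of a `static mut` array to stay within safe Rust.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Number of signal slots (hypothetical size, one slot per branch of interest).
const SIGNALS_LEN: usize = 16;
const ZERO: AtomicU8 = AtomicU8::new(0);

/// Progress map that an observer on the fuzzer side would inspect after each run.
static SIGNALS: [AtomicU8; SIGNALS_LEN] = [ZERO; SIGNALS_LEN];

/// Mark the branch identified by `idx` as reached.
pub fn signals_set(idx: usize) {
    SIGNALS[idx].store(1, Ordering::Relaxed);
}

/// Clear all signals before a new execution.
pub fn signals_reset() {
    for slot in SIGNALS.iter() {
        slot.store(0, Ordering::Relaxed);
    }
}

/// Copy the current signals out, e.g. to compare two executions.
pub fn signals_snapshot() -> [u8; SIGNALS_LEN] {
    std::array::from_fn(|i| SIGNALS[i].load(Ordering::Relaxed))
}

/// Hypothetical harness: each `if` of interest gets one manual signal.
pub fn harness(input: &[u8]) {
    signals_set(0); // entry reached
    if input.first() == Some(&b'a') {
        signals_set(1); // first branch taken
        if input.get(1) == Some(&b'b') {
            signals_set(2); // nested branch taken
        }
    }
}
```

Each new non-zero slot tells the fuzzer that an input made further progress, which is why every branch of interest needs its own hand-written call in this approach.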
Obviously, this solution breaks our genericity requirement, since we had to duplicate that Rust code from:
symrustc/Dockerfile
Line 488 in 33b425d
symrustc/examples/source_0_original_1c0_rs/src/main.rs
Line 10 in 33b425d
In this issue, we are interested in modifying the way our original Rust example gets automatically compiled in `libafl_solving_build.sh`:
symrustc/Dockerfile
Line 492 in 33b425d
so that the input Rust source is automatically annotated with `explicit signalling` calls, or is automatically linked to take advantage of `libafl_targets`. (Here, any solution should be fine as long as the code ultimately gets generated automatically.)

In both solutions, one has to make sure that the automatic transformation does not alter the original concolic capability of the binary, because the binary may ultimately be invoked by LibAFL in different concolic settings (respectively, with the concolic mode on and off).