retry tests with current fb #1966

Open
wants to merge 75 commits into
base: master

vanhauser-thc
Collaborator

No description provided.

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-07-aflpp --fuzzers aflplusplus_early aflplusplus_last

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-07-aflpp2 --fuzzers aflplusplus_early aflplusplus_last

@vanhauser-thc
Collaborator Author

vanhauser-thc commented Apr 7, 2024

@DonggeLiu I'm having a lot of trouble getting the benchmarks to work.

Locally everything compiles fine for me, e.g. re2_fuzzer:

$ make test-run-aflplusplus_early-re2_fuzzer
docker build \
--tag gcr.io/fuzzbench/builders/benchmark/re2_fuzzer \
--build-arg BUILDKIT_INLINE_CACHE=1 \
--cache-from gcr.io/fuzzbench/builders/benchmark/re2_fuzzer \
--file benchmarks/re2_fuzzer/Dockerfile \
benchmarks/re2_fuzzer
[+] Building 12.2s (12/12) FINISHED                              docker:default
...
[*] Fuzzing test case #1848 (1882 total, 0 crashes saved, state: started :-), mode=explore, perf_score=300, weight=inf, favorite=1, was_fuzzed=0, exec_us=0, hits=0, map=337, ascii=0, run_time=0:00:00:14)...
INFO:root:Doing final sync.

but when building here I get

Step #2 - "aflplusplus_early-re2_fuzzer-builder-intermediate": #7 ERROR: executor failed running [/bin/sh -c apt-get install -y lsb-release software-properties-common gnupg wget]: exit code: 100
Step #2 - "aflplusplus_early-re2_fuzzer-builder-intermediate": ------
Step #2 - "aflplusplus_early-re2_fuzzer-builder-intermediate":  > [3/6] RUN apt-get install -y lsb-release software-properties-common gnupg wget:
Step #2 - "aflplusplus_early-re2_fuzzer-builder-intermediate": ------
Step #2 - "aflplusplus_early-re2_fuzzer-builder-intermediate": executor failed running [/bin/sh -c apt-get install -y lsb-release software-properties-common gnupg wget]: exit code: 100
Finished Step #2 - "aflplusplus_early-re2_fuzzer-builder-intermediate"
ERROR
ERROR: build step 2 "gcr.io/cloud-builders/docker" failed: step exited with non-zero status: 1
------------------

That makes no sense: for one fuzzer, the same two targets (ossfuzz, openh264) always build on fuzzbench and all others fail, while the other fuzzer instance succeeds on all targets. The only difference between the two in builder.Dockerfile is:

@@ -41,7 +41,7 @@
 ENV LLVM_CONFIG=llvm-config-18
 
 # Download afl++.
-RUN git clone -b early https://github.com/AFLplusplus/AFLplusplus /afl && \
+RUN git clone -b last https://github.com/AFLplusplus/AFLplusplus /afl && \
     cd /afl && \
     true

do you have any idea what is going wrong?
btw. you can kill all afl++ fuzzing instances ...

@DonggeLiu
Contributor

ERROR: executor failed running [/bin/sh -c apt-get install -y lsb-release software-properties-common gnupg wget]: exit code: 100

Could flaky network issues cause this?

In the past, I recall seeing apt-get fail due to a network problem and then work again after a few hours in CI tests.
Maybe re-try it in a few hours and see if it occurs again?

@vanhauser-thc
Collaborator Author

I've had this issue since Friday. And if it were a network issue, it would affect both fuzzers and random targets.

@vanhauser-thc
Collaborator Author

And you can see it works in the CI too: it’s green for most.

@DonggeLiu
Contributor

DonggeLiu commented Apr 8, 2024

I've had this issue since Friday. And if it were a network issue, it would affect both fuzzers and random targets.

I see. That's strange because I don't recall changing anything related last week.
Unfortunately, I will need more time before I can debug this because I am currently occupied by other tasks.

For now, I can:

  1. Cancel all AFL++ experiments, and
  2. Re-launch the experiment here, just in case my account makes any difference (which is unexpected).

Meanwhile, I guess two potential ways may help us understand this error better:

  1. Split the apt-get command and install one package in each. This helps us see which one caused the failure.
  2. Add apt-get update && before the apt-get that caused the error. This should be unnecessary because your first RUN command already did it, but it can at least rule out a possibility.
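The two suggestions above could be combined in one small script. This is only a sketch: the function name `install_one_by_one` and the `APT_GET` override are made up for illustration (they are not part of FuzzBench), and the package list in the usage comment is the one from the failing step.

```shell
#!/bin/sh
# Sketch: refresh the package index, then install each package separately
# so the build log names the exact package that fails.
# APT_GET is a hypothetical override so the loop can be dry-run without apt.
APT_GET="${APT_GET:-apt-get}"

install_one_by_one() {
    # Refresh the index first; this rules out a stale index as the cause.
    $APT_GET update || { echo "FAILED: update" >&2; return 1; }
    for pkg in "$@"; do
        if ! $APT_GET install -y "$pkg"; then
            echo "FAILED: $pkg" >&2
            return 1
        fi
        echo "OK: $pkg"
    done
}

# Usage inside a builder image (same packages as the failing step):
# install_one_by_one lsb-release software-properties-common gnupg wget
```

In a Dockerfile, running the update and the installs from the same RUN step means both end up in the same image layer, so a cached update cannot be paired with a later install.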

@DonggeLiu
Contributor

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-08-aflpp --fuzzers aflplusplus_early aflplusplus_last

@vanhauser-thc
Collaborator Author

Same in your run. CI is green for the targets, but in the fuzzing run only one target built successfully for the same fuzzer :(
And only half of the targets are there. Weird.

The test is important because it tests a major change for LLVM 16+ and we need a release very soon.

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-08-aflpp3 --fuzzers aflplusplus_early aflplusplus_last

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-09-aflpp --fuzzers aflplusplus_early aflplusplus_last

@vanhauser-thc
Collaborator Author

@DonggeLiu it worked when I switched to llvm 16 (or the issue just dissolved for other reasons). trying llvm 19 now.

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-04-09-aflpp2 --fuzzers aflplusplus_early aflplusplus_last

@jonathanmetzman
Contributor

I'm guessing the differences between this happening in prod vs local are because of caching.
I agree with Dongge that the issue looks like it is caused by not having apt-get update && before apt-get install
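One way to test the caching theory locally might be to rebuild an image with no cached layers, so the local build starts from the same clean state as a fresh builder. The sketch below reuses the tag and paths from the local build log earlier in this thread; the function name `rebuild_without_cache` and the `DOCKER` override are hypothetical, and the same flags would have to be applied to whichever image actually fails (here the fuzzer builder image, whose exact build command is not shown in this thread).

```shell
#!/bin/sh
# Sketch: rebuild a benchmark image while ignoring all cached layers.
# DOCKER is a hypothetical override so the command can be printed instead of run.
DOCKER="${DOCKER:-docker}"

rebuild_without_cache() {
    benchmark="$1"
    # --no-cache skips the layer cache; --pull re-fetches the base image.
    $DOCKER build \
        --no-cache \
        --pull \
        --tag "gcr.io/fuzzbench/builders/benchmark/$benchmark" \
        --file "benchmarks/$benchmark/Dockerfile" \
        "benchmarks/$benchmark"
}

# Usage: rebuild_without_cache re2_fuzzer
```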

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-05-18-aflpp --fuzzers aflpp aflpp2

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-05-19-aflpp --fuzzers aflpp aflpp2

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-05-20-aflpp --fuzzers aflpp aflpp2

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-05-23-aflpp --fuzzers aflplusplus aflplusplus_weight0 aflplusplus_weight1

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-05-24-aflpp --fuzzers aflplusplus aflplusplus_weight0 aflplusplus_aweight0 aflplusplus_aweight1 --benchmarks bloaty_fuzz_target curl_curl_fuzzer_http freetype2_ftfuzzer harfbuzz_hb-shape-fuzzer jsoncpp_jsoncpp_fuzzer lcms_cms_transform_fuzzer libjpeg-turbo_libjpeg_turbo_fuzzer libpcap_fuzz_both libpng_libpng_read_fuzzer libxml2_xml libxslt_xpath mbedtls_fuzz_dtlsclient openh264_decoder_fuzzer openssl_x509 openthread_ot-ip6-send-fuzzer proj4_proj_crs_to_crs_fuzzer re2_fuzzer sqlite3_ossfuzz stb_stbi_read_fuzzer systemd_fuzz-link-parser vorbis_decode_fuzzer woff2_convert_woff2ttf_fuzzer zlib_zlib_uncompress_fuzzer

@vanhauser-thc
Collaborator Author

@jonathanmetzman this is what I meant by the issues I have on fuzzbench:

Everything built fine for https://www.fuzzbench.com/reports/experimental/2024-05-23-aflpp/index.html

In https://www.fuzzbench.com/reports/experimental/2024-05-24-aflpp/index.html I didn’t change these but added two more. The ones I added are fine, but the original two now have one target that didn’t build.

I didn’t check the build logs to see what exactly went wrong, but either way it is something that fuzzbench should detect and retry, wiping a cache beforehand or doing whatever else addresses the cause.
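As an illustration of the "detect and retry" idea, a generic retry wrapper could look like the sketch below. It is not how FuzzBench handles build failures; the function name `retry` and its arguments are hypothetical, and any cache cleanup would have to be added to the command being retried.

```shell
#!/bin/sh
# Sketch: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    max="$1"; delay="$2"; shift 2
    attempt=1
    while :; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        if [ "$attempt" -ge "$max" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s: $*" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Usage (hypothetical build command): retry 3 60 make some-build-target
```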

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-09-09-aflpp --fuzzers aflplusplus_nocmplog libafl_fuzz

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-09-19-aflpp --fuzzers aflplusplus_nocmplog libafl_fuzz

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-09-20-aflpp --fuzzers aflplusplus_nocmplog aflplusplus_vp0 aflplusplus_vp1

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-09-20-aflpp2 --fuzzers aflplusplus_nocmplog aflplusplus_vp0 aflplusplus_vp1

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-09-20-aflpp3 --fuzzers aflplusplus_nocmplog aflplusplus_vp0 aflplusplus_vp1

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-09-21-aflpp --fuzzers aflplusplus_nocmplog aflplusplus_vp0 aflplusplus_vp1

@vanhauser-thc
Collaborator Author

@DonggeLiu the last 3 experiments are not starting. According to the logs they were built successfully, though, and I updated to the newest commit. Do you know why?

@vanhauser-thc
Collaborator Author

@DonggeLiu I implemented value profile for afl++ (without queue pollution, but currently with fewer solving mechanisms; I want to test it first) - and would like to test it ... :)

@DonggeLiu
Contributor

Thanks for pinging me on this: I wanted to take a look earlier but was busy with other things...

Let me try triggering the experiment again in case it is flaky; I will look into the logs if that fails.

@DonggeLiu
Contributor

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-09-24-aflpp --fuzzers aflplusplus_nocmplog aflplusplus_vp0 aflplusplus_vp1

@DonggeLiu
Contributor

DonggeLiu commented Sep 24, 2024

Hmm, I think 2024-09-21-aflpp works fine?
https://storage.googleapis.com/www.fuzzbench.com/reports/experimental/2024-09-21-aflpp/index.html

The log console also shows it terminated successfully:
image

(Taking a wild guess) Maybe you checked the non-experimental link?
IIUC, FuzzBench uses the experimental link for non-core fuzzers (e.g., the ones above) and non-experimental link for core fuzzers (afl++, libfuzzer, etc.)
The difference is the experimental/ subdir in the link.
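The link difference described above can be sketched as a small helper. The experimental layout matches the report URLs posted in this thread; the non-experimental layout is an assumption inferred from "the difference is the experimental/ subdir", and the function name `report_url` is hypothetical.

```shell
#!/bin/sh
# Sketch: build the report URL for an experiment.
# report_url EXPERIMENT_NAME [experimental|core]
report_url() {
    name="$1"; kind="${2:-experimental}"
    base="https://www.fuzzbench.com/reports"
    if [ "$kind" = "experimental" ]; then
        echo "$base/experimental/$name/index.html"
    else
        # Assumed layout: same path without the experimental/ subdir.
        echo "$base/$name/index.html"
    fi
}

# Usage: report_url 2024-09-21-aflpp
```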

@vanhauser-thc
Collaborator Author

Weird, now they are all there, but they weren’t on the weekend.

@vanhauser-thc
Collaborator Author

/gcbrun run_experiment.py -a --experiment-config /opt/fuzzbench/service/experiment-config.yaml --experiment-name 2024-09-25-aflpp --fuzzers aflplusplus_nocmplog aflplusplus_vp0 aflplusplus_vp1

@vanhauser-thc
Collaborator Author

@DonggeLiu OK I see the same thing again - the report is not showing up.
However I see that it is running: https://storage.googleapis.com/fuzzbench-data/index.html?prefix=2024-09-25-aflpp/experiment-folders/

but it is neither here https://www.fuzzbench.com/reports/experimental/index.html nor there https://storage.googleapis.com/www.fuzzbench.com/reports/experimental/2024-09-25-aflpp/index.html

My guess is it will only be visible once the experiment is completed (which is why the earlier reports showed up days later and are visible now, but were not in the days after I started them).
This is a new thing.

@DonggeLiu
Contributor

but it is neither here https://www.fuzzbench.com/reports/experimental/index.html nor there https://storage.googleapis.com/www.fuzzbench.com/reports/experimental/2024-09-25-aflpp/index.html

Is this using a lot of space?
I saw messages like Retrying on experiment.runner.TrialRunner.archive_and_save_corpus failed with [Errno 28] No space left on device. Raise.:

image

My guess is it will only be visible once the experiment is completed (which is why the earlier reports showed up days later and are visible now, but were not in the days after I started them). This is a new thing.

I don't think we have changed the report upload logic. It should be the same as before.
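For the `[Errno 28] No space left on device` message quoted above, a quick check of how full the relevant filesystem is could look like the sketch below. The function name `disk_used_percent` is hypothetical, and which directory the runner archives the corpus to is not stated in this thread, so the path has to be filled in.

```shell
#!/bin/sh
# Sketch: print the percentage of space used on the filesystem holding a path.
disk_used_percent() {
    # df -P prints one POSIX-format line per filesystem; field 5 is "NN%".
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (path is a placeholder): disk_used_percent /path/to/corpus/dir
```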

@vanhauser-thc
Collaborator Author

No, nothing creates a lot of data, not more than usual.
but you see that it is running, and new report is coming up?

@vanhauser-thc
Collaborator Author

@DonggeLiu now that the test is completed the report is up … and two fuzzers are missing, and they produced coverage so they should be there.

Also the results look suspect - no cmplog is better than with cmplog? No …

@DonggeLiu
Contributor

Also the results look suspect - no cmplog is better than with cmplog? No …

I don't know the internals of AFL++ as well as you do, so I have no clue why that is better.

and two fuzzers are missing, and they produced coverage so they should be there.

Which 2 are missing?
I can search for their logs

@vanhauser-thc
Collaborator Author

and two fuzzers are missing, and they produced coverage so they should be there.

Which 2 are missing? I can search for their logs

aflplusplus_vp0 aflplusplus_vp1

thank you!

@vanhauser-thc
Collaborator Author

and two fuzzers are missing, and they produced coverage so they should be there.

Which 2 are missing? I can search for their logs

aflplusplus_vp0 aflplusplus_vp1

thank you!

@DonggeLiu reminder :) thank you!

@DonggeLiu
Contributor

aflplusplus_vp0 aflplusplus_vp1
thank you!

@DonggeLiu reminder :) thank you!

Thanks for pinging me.
The gcloud log does not show anything suspicious, but the fuzzer run log seems to imply some early termination problem.

gcloud Log

Taking aflplusplus_vp0 as an example, there are only 2 kinds of warnings/errors, e.g.:
image

The error message says instance 3191872 failed to upload corpus 0037 for some reason:
image

But:

  1. The corpus definitely exists because it can be found in the data dir.
  2. There were other earlier gsutil cp errors from the same instance, which did not block the experiment.

Exp Data Dir

IIUC, the fuzzer only seems to have run for < 14 hours?
The last corpus archive was created 13.5 hours after the first one.
The fuzzer log did not record anything interesting after 10 hours.

Do you happen to know the cause?
Maybe adding more logs in the fuzzer can help investigate the cause.

4 participants