Use Google Cloud log for stages/agents logger #812

Draft
wants to merge 2 commits into
base: main

DonggeLiu
Collaborator

@DonggeLiu DonggeLiu commented Feb 21, 2025

Without this, messages from the stages/agents loggers show incorrect logging levels.
image
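
For illustration, here is a minimal sketch of attaching an explicit severity to every log record. It is not the code in this PR, which uses the Google Cloud logger. It assumes the log sink honors a `severity` field in JSON lines; `SeverityJsonFormatter` and `make_stage_logger` are hypothetical names.

```python
import io
import json
import logging


class SeverityJsonFormatter(logging.Formatter):
  """Formats each record as one JSON object with an explicit severity."""

  def format(self, record: logging.LogRecord) -> str:
    return json.dumps({
        # Python's DEBUG/INFO/WARNING/ERROR/CRITICAL level names are also
        # valid Cloud Logging severity names.
        'severity': record.levelname,
        'message': record.getMessage(),
        'logger': record.name,
    })


def make_stage_logger(name: str, stream) -> logging.Logger:
  """Returns a logger (hypothetical helper) writing JSON lines to stream."""
  logger = logging.getLogger(name)
  logger.setLevel(logging.DEBUG)
  # Avoid duplicate output through the root logger's handlers.
  logger.propagate = False
  handler = logging.StreamHandler(stream)
  handler.setFormatter(SeverityJsonFormatter())
  logger.handlers = [handler]
  return logger
```

With this, `make_stage_logger('stage', sys.stderr).info('built')` emits a line whose `severity` is `INFO` regardless of which stream carries it.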

Others:

  1. No need to log the full `chat_history` (the 'warning' message in the figure above).
  2. Ignore the `No MD5 checksum` messages (from dill?).
  3. Reconsider whether we want to print the (full) fuzz target, build script, and compilation error when logging the Results.
  4. Omit `sysctl: setting key "vm.mmap_rnd_bits", ignoring: Read-only file system` from stderr when presenting it to the LLM (and when saving it to `BuildResult`). It is a common but benign message that occurs on many projects but has never affected compilation or LLM generation.
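
Item 4 could be done roughly as follows. This is a sketch under assumptions, not the project's implementation: `strip_benign_stderr` and the pattern list are hypothetical, and only the message quoted above is treated as benign.

```python
import re

# The benign message quoted in item 4; the list itself is illustrative.
_BENIGN_STDERR_PATTERNS = (
    re.compile(r'^sysctl: setting key "vm\.mmap_rnd_bits", '
               r'ignoring: Read-only file system\s*$'),
)


def strip_benign_stderr(stderr: str) -> str:
  """Drops lines matching a known-benign pattern, keeping the rest in order."""
  kept = [
      line for line in stderr.splitlines(keepends=True)
      if not any(p.match(line) for p in _BENIGN_STDERR_PATTERNS)
  ]
  return ''.join(kept)
```

Applying the filter once, before the text reaches either the prompt or the stored build result, would keep both views consistent.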

@DonggeLiu DonggeLiu marked this pull request as draft February 21, 2025 04:28
@DonggeLiu
Collaborator Author

/gcbrun exp -n dg -ag

DonggeLiu added a commit that referenced this pull request Mar 3, 2025
1. [x] Reuse SemanticAnalyzer from the one-prompt workflow
2. [x] Replicate Enhancer from OnePromptEnhancer
3. [x] Last result can still be `BuildResult` if build failure.
4. [x] When Execution Fails, still record `BuildResult` without
`RunResult`.
5. [ ] Download [`xs`
benchmarks](https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2025-02-23-811-dg2-comparison/benchmark/output-xs-fxawaitimport/index.html)
to understand why its successful `BuildResult` was not recorded on final
experiment report after its run failure.
6. [x] Clean up commented code.
7. [x] Do not print `chat_history` when logging `Results`, which makes
it unreadable.
8. [x] Stringify `WorkDir` and `Benchmark` in `Result`s.
9. [x] Maybe `write_result()` at the end of every stage, so that the
report is always up-to-date.

Next:
1. [ ] [Fix log type](#812) for easier debugging.
2. [ ] Understand and fix prompt [truncation logic
error](https://llm-exp.oss-fuzz.com/Result-reports/ofg-pr/2025-02-23-811-dg2-comparison/sample/output-xs-fxawaitimport/04.html).
3. [ ] Re-run cloud build if a flaky step failed (e.g., `apt install`).
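
Items 7 and 8 of the commit message above (keeping the chat history out of logged results and storing plain strings) can be sketched with a dataclass. `ResultSketch` and its fields are hypothetical stand-ins, not the project's actual result classes.

```python
from dataclasses import dataclass, field


@dataclass
class ResultSketch:
  """Hypothetical stand-in for a result object; not the project's class."""
  benchmark: str  # Stored as a plain string rather than a rich object.
  work_dir: str
  compiles: bool = False
  # Excluded from the generated __repr__, so logging the object stays short.
  chat_history: dict = field(default_factory=dict, repr=False)
```

The history stays available as an attribute for anything that needs it; only the generated `repr` skips it.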