NoSuchFileException during dynamic Javac execution with experimental_delay_virtual_input_materialization #12904
Comments
I've been chasing down something similar internally. It's a mix of poor error reporting and the inherent hard-to-reproduce state of dynamic execution bugs. In the case I was looking at, a rule was creating outputs that were set non-readable, which under certain conditions would trigger a crash that looked like this.
The internal problem didn't look similar to this. I have seen nothing like this.
Perhaps eb762d4 in 4.1.0 will help here?
I see it too when mixing remote with workers when using dynamic.
---8<---8<--- Exception details ---8<---8<---
I notice the paths are all in the execroot, not the worker directories, so it might well be a race condition. I was just talking with @tjgq about the materialization logic and how the workers don't really need to re-read the file. Keeping the file content in memory would save some disk reads and hopefully prevent this problem. The expansion (reading files into the request arguments) happens at src/main/java/com/google/devtools/build/lib/worker/WorkerSpawnRunner.java:301; the writing of the files happens at src/main/java/com/google/devtools/build/lib/sandbox/SandboxHelpers.java:477. While the write itself is atomic, it's possible that other workers interfere later.
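To make the in-memory idea concrete, here is a minimal sketch. It is not the code in WorkerSpawnRunner or SandboxHelpers: the class ParamFileExpansion and both method names are hypothetical, and it assumes a params file holds one argument per line and is referenced on the command line as @path. The first method re-reads the file from disk at expansion time, so it fails if anything removes the file between the write and the read; the second expands from content that is already held in memory and never touches the disk.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not Bazel's actual implementation. */
class ParamFileExpansion {

  /** Expands "@file" arguments by re-reading each params file from disk. */
  public static List<String> expandFromDisk(List<String> args, Path root) throws IOException {
    List<String> expanded = new ArrayList<>();
    for (String arg : args) {
      if (arg.startsWith("@")) {
        // Throws NoSuchFileException if the file was removed after being written.
        Path paramsFile = root.resolve(arg.substring(1));
        expanded.addAll(Files.readAllLines(paramsFile, StandardCharsets.UTF_8));
      } else {
        expanded.add(arg);
      }
    }
    return expanded;
  }

  /** Expands "@file" arguments from contents already held in memory. */
  public static List<String> expandFromMemory(
      List<String> args, Map<String, List<String>> virtualInputs) {
    List<String> expanded = new ArrayList<>();
    for (String arg : args) {
      if (arg.startsWith("@")) {
        List<String> lines = virtualInputs.get(arg.substring(1));
        if (lines == null) {
          throw new IllegalArgumentException("Unknown params file: " + arg);
        }
        expanded.addAll(lines);
      } else {
        expanded.add(arg);
      }
    }
    return expanded;
  }
}
```

With this shape, a concurrent cleanup of the execroot can only affect expandFromDisk. Whether the real failure follows this pattern is still an open question in this thread.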
If this is still happening, could you give some more details on what the command lines look like for these worker requests? Are there multiple or recursive flagfiles?
Description of the problem:
When building Java targets with dynamic execution, I frequently (but not always) get a crash compiling java_library targets. Stack traces below; they seem to be the same problem.
- It is a .params file that is missing.
- It happens with --experimental_delay_virtual_input_materialization enabled.
- --worker_sandboxing doesn't help.
- How often it happens depends on --experimental_local_execution_delay.
Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.
I can't provide a minimal example because I don't have a public remote exec environment to share, but here are the flags I use for dynamic execution:
I've seen it with experimental_local_execution_delay as high as 250; a small value like 2 makes it occur almost every build.
What operating system are you running Bazel on?
Linux 5.1.0-1.el7.elrepo.x86_64
What's the output of bazel info release?
release 3.7.2
Any other information, logs, or outputs that you want to share?