JFR main thread dies because build log file does not exist yet #62
Labels
bug
Comments
anbrsap added a commit that referenced this issue Jan 18, 2021
Create the local build log file _before_ starting JFR, so that JFR's log copy does not fail. Should be reverted once a permanent fix for #62 has been established.
Closed
We have seen this before. Apparently, it is the job of the main thread to shut down everything once the build is finished.
This was referenced Jan 27, 2021
romanisb pushed a commit to romanisb/stewardci-jenkinsfilerunner-image that referenced this issue Oct 19, 2022
romanisb added a commit to romanisb/jenkinsfile-runner that referenced this issue Jun 14, 2023
Problem
Sometimes Steward pipeline runs do not terminate correctly and are killed after the timeout period, although according to the pipeline log the pipeline itself finished (either successfully or with an error).
At the beginning of the JFR container log the following messages could be found:
It seems to happen more often if all CPUs of the Kubernetes node are fully utilized while the JFR starts up, e.g. when many pipeline runs are triggered simultaneously.
Version Information
First Analysis
The stack trace shows the call stack of the main thread. The InvocationTargetException wraps the FileNotFoundException thrown by the reflective call to io.jenkins.jenkinsfile.runner.Runner.run().
io.jenkins.jenkinsfile.runner.Runner.run() schedules a new build, waits until it has been started, and then starts copying the build log file contents to stdout.
Presumably, with our Elasticsearch Log Plug-In the local build log file is created some time after the build has been started. So there is a race condition between the Elasticsearch Log Plug-In initialization (which creates the local log file) and the log copy performed by io.jenkins.jenkinsfile.runner.Runner.run(). If the latter is faster, it fails to open the local build log file.
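One general way to defuse a race like this is for the reader to wait until the file exists before opening it, instead of assuming it is already there. The following is only a sketch of that idea and not code from JFR or from the plug-in: the class `LogFileWaiter`, the method `waitForFile`, the file name `log`, and the timeout values are all hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;

public class LogFileWaiter {

    /**
     * Polls until the given path exists or the timeout elapses.
     * Returns true if the file appeared in time, false otherwise.
     * (Hypothetical helper, not part of JFR.)
     */
    public static boolean waitForFile(Path path, Duration timeout, Duration pollInterval)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!Files.exists(path)) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up; the caller decides how to report this
            }
            Thread.sleep(pollInterval.toMillis());
        }
        return true;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("build");
        Path log = dir.resolve("log"); // assumed file name, for illustration only

        // Stand-in for the component that creates the log file late.
        Thread creator = new Thread(() -> {
            try {
                Thread.sleep(200);
                Files.createFile(log);
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }, "log-file-creator");
        creator.start();

        // The reader waits instead of failing immediately with FileNotFoundException.
        boolean appeared = waitForFile(log, Duration.ofSeconds(5), Duration.ofMillis(20));
        System.out.println("log file appeared: " + appeared);
        creator.join();
    }
}
```

Polling with a timeout keeps the reader from hanging forever if the file never shows up. Whether a wait like this belongs in the runner, or whether the file should instead be created earlier, is a separate design question.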
As the build execution happens in other threads, the pipeline can run to completion even after the main thread died. I currently don't know why the JVM doesn't terminate.
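The first half of that observation matches general JVM behavior: a thread that dies from an uncaught exception does not stop other threads, and the JVM keeps running as long as any non-daemon thread is alive (the main thread is no different in this respect). The sketch below only illustrates that general behavior with made-up thread names. It does not model JFR's actual threads and does not by itself explain the hang seen here.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class ThreadDeathDemo {

    /**
     * Starts a worker (stand-in for a build executor thread) and a runner
     * thread that dies immediately from an uncaught exception.
     * Returns true if the worker still ran to completion.
     */
    public static boolean workerSurvivesRunnerDeath() throws InterruptedException {
        CountDownLatch buildDone = new CountDownLatch(1);

        Thread worker = new Thread(() -> {
            try {
                Thread.sleep(200); // pretend to execute the pipeline
                buildDone.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "build-executor"); // hypothetical name
        worker.start();

        Thread runner = new Thread(() -> {
            throw new IllegalStateException("simulated: build log file not found");
        }, "runner"); // hypothetical stand-in for the thread that copies the log
        runner.setUncaughtExceptionHandler((t, e) ->
                System.err.println(t.getName() + " died: " + e.getMessage()));
        runner.start();
        runner.join(); // the runner thread is dead at this point

        // The worker is unaffected by the runner's death.
        return buildDone.await(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker finished after runner died: " + workerSurvivesRunnerDeath());
    }
}
```

If the thread that died was also the one responsible for triggering shutdown, any remaining non-daemon threads would keep the process alive afterwards. That would be consistent with the symptoms, but it is an assumption that would need to be checked against a thread dump.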