Fix JVM deoptimization loop due to lambda misprediction#20832
Fix JVM deoptimization loop due to lambda misprediction#20832itschrispeck wants to merge 1 commit intoopensearch-project:mainfrom
Conversation
Signed-off-by: Christopher Peck <peck@uber.com>
PR Reviewer Guide 🔍Here are some key observations to aid the review process:
|
PR Code Suggestions ✨Explore these optional code suggestions:
|
|
❌ Gradle check result for 9887b0e: FAILURE Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change? |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #20832 +/- ##
============================================
- Coverage 73.31% 73.26% -0.06%
+ Complexity 72248 72230 -18
============================================
Files 5795 5795
Lines 330044 330060 +16
Branches 47641 47643 +2
============================================
- Hits 241975 241821 -154
- Misses 68609 68880 +271
+ Partials 19460 19359 -101 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
|
||
| BulkItemResponse primaryResponse = this.primaryResponse; | ||
|
|
||
| // Explicitly check if primaryResponse is null to avoid branching/mispredictions in a lambda |
There was a problem hiding this comment.
@itschrispeck @varunbharadwaj This comment will help, but it seems possible a well-meaning future developer could "simplify" this code by using writeOptionalWriteable() again. Could we refactor StreamOutput::writeOptionalWriteable to avoid the lamba instead of changing the code here?
Description
This change resolves a JVM deoptimization loop by refactoring how primaryResponse is written to the stream.
To do this, we explicitly check if
primaryResponse == null, avoiding the check within the lambda. This prevents the C2 compiler from generating aninvokedynamicinstruction, which is prone to permanent deoptimization when branch traffic flip-flops. The explicit if/else allows the JVM to compile both paths, relying on CPU branch prediction instead of a JVM trap.It also fixes a potential race condition where the volatile
primaryResponseis read once to check, and read again for use, though it may be modified in between.This pulls the
writeOptionalWriteablelogic outside, but this follows the convention of other places in the repo.Background
We recently found elevated CPU despite no change in load.

Profiling revealed it was due to a deopt/uncommon trap hit at

BulkItemRequest#writeThinThis is an example deopt event from the profile:
action = "none"implies that we've hit the recompilation cutoff for the method. Since lambdas useinvokedynamic, my understanding is that it's possible to flip-flop between predictions, eventually hitting the cutoff. In this case, the cutoff was hit while the wrong branch was compiled, and each invocation hits the trap.Related Issues
n/a
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.