Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix trampolines when startPC returns error value #7550

Merged
merged 1 commit into from
Nov 28, 2024

Conversation

IBMJimmyk
Copy link
Contributor

@IBMJimmyk IBMJimmyk commented Nov 19, 2024

When finding trampolines or syncing temporary trampolines, there is a call to startPC with the method to try and get the start PC. If the latest compilation of that method failed, startPC may return zero or a negative value. Both cases are not a real start PC and can not be used as a target for a trampoline. Attempting to do so will cause an error.

This change adds error checking when trying to create or modify a trampoline. When a zero or negative start PC is returned from the call to startPC, the previous start PC is used instead.

Fixes:
eclipse-openj9/openj9#20567

Edit:
Related to:
eclipse-openj9/openj9#20657
They don't need to go in at the same time since this is currently broken and both PRs are needed to fix the issue. Checking in OpenJ9 or OMR first won't make the problem worse.

@@ -587,6 +587,11 @@ OMR::CodeCache::findTrampoline(TR_OpaqueMethodBlock * method)
{
void *newPC = (void *) TR::Compiler->mtd.startPC(method);

if (reinterpret_cast<intptr_t>(newPC) <= 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't look right.
if there is no existing trampoline, the _currentStartPC (associated with the trampoline) must be NULL. it didn't mean the method hasn't been compiled though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right. I'll need to take a look at this again.

if (trampoline && entry->_info._resolved._currentStartPC != newPC)
bool forceCreateTrampoline = false;

if (reinterpret_cast<intptr_t>(newPC) <= 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the existence of temp trampoline means _currentStartPC must not be NULL, such that this fix could work. however, <=0 test is not right ... you cannot assume codeCache is not in 0x[8-F]..... areas. Rather, J9TR_MethodNotCompiledBit is defined for the testing ... low tagging really it is. Not sure it has been used in OMR side though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just thinking that this check might not be right. I will look at what it should be.

@IBMJimmyk
Copy link
Contributor Author

I talked to @dsouzai and he pointed out to me that the -3 is coming from J9_JIT_NEVER_TRANSLATE here:
https://github.com/eclipse-openj9/openj9/blob/48709bf2b235d4f374b48ac3052cedb89d47ef1a/runtime/oti/j9consts.h#L233
But a check for J9_STARTPC_NOT_TRANSLATED which is the same as J9TR_MethodNotCompiledBit should also work.

One problem is that these macros don't exist in OMR. But, this can be dealt with by modifying the OpenJ9 implementation of VMMethodEnv::startPC.

OMR version:
https://github.com/eclipse-openj9/openj9-omr/blob/ef89b64bcaf398492b7c748135b226ccbf76940b/compiler/env/OMRVMMethodEnv.hpp#L58-L62
Based on the comment the only error value is supposed to be 0 and anything else will be treated as a real startPC.
syncTempTrampolines doesn't check for 0 either so it still needs to be updated.

Openj9 version:
https://github.com/eclipse-openj9/openj9/blob/48709bf2b235d4f374b48ac3052cedb89d47ef1a/runtime/compiler/env/J9VMMethodEnv.cpp#L66-L71
The idea will be to add a check for J9_STARTPC_NOT_TRANSLATED here and return 0 for all the different error types.
This is only called from a few places which I will check over to make sure it all works out.

I changing this PR to WIP for now while I confirm the exact details.

@IBMJimmyk IBMJimmyk changed the title Fix trampolines when startPC returns error value WIP: Fix trampolines when startPC returns error value Nov 20, 2024
@IBMJimmyk
Copy link
Contributor Author

I changed the code to recognize 0 as the error return value. The OpenJ9 change to make startPC return 0 instead of -3 is here:
eclipse-openj9/openj9#20657

When syncing temporary trampolines, there is a call to startPC with the
callee method to try and get the start PC. If the latest compilation of
that method failed, startPC returns zero which is not a real start PC
and can not be used as a target for a trampoline. Attempting to do so
will cause an error.

This change adds error checking when trying to sync trampolines. When a
start PC of zero is returned from the call to startPC, the previous
start PC is used instead.

Signed-off-by: jimmyk <[email protected]>
@IBMJimmyk IBMJimmyk changed the title WIP: Fix trampolines when startPC returns error value Fix trampolines when startPC returns error value Nov 27, 2024
@IBMJimmyk
Copy link
Contributor Author

I've finished running my own personal tests on the change and everything seems to look good so I removed the WIP tag.

@dsouzai When you get a chance can you take a look at this?
@zl-wang Could you take another look as well?

Copy link
Contributor

@zl-wang zl-wang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good ...

@dsouzai
Copy link
Contributor

dsouzai commented Nov 27, 2024

jenkins build all

@dsouzai dsouzai self-assigned this Nov 27, 2024
@IBMJimmyk
Copy link
Contributor Author

The failing build didn't get far enough to even compile a build with my changes:
https://ci.eclipse.org/omr/job/PullRequest-linux_riscv64/20/console

Also:   org.jenkinsci.plugins.workflow.actions.ErrorAction$ErrorId: cd7fe8bc-5af8-46ee-8c80-5392a191d995
java.lang.NullPointerException: Cannot get property 'label' on null object
	at org.codehaus.groovy.runtime.NullObject.getProperty(NullObject.java:60)
	at org.codehaus.groovy.runtime.InvokerHelper.getProperty(InvokerHelper.java:190)
	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.getProperty(ScriptBytecodeAdapter.java:469)
	at PluginClassLoader for script-security//org.kohsuke.groovy.sandbox.impl.Checker$7.call(Checker.java:377)
	at PluginClassLoader for script-security//org.kohsuke.groovy.sandbox.impl.Checker.checkedGetProperty(Checker.java:379)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.sandbox.SandboxInvoker.getProperty(SandboxInvoker.java:29)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.LoggingInvoker.getProperty(LoggingInvoker.java:133)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.PropertyAccessBlock.rawGet(PropertyAccessBlock.java:20)
	at WorkflowScript.run(WorkflowScript:607)
	at ___cps.transform___(Native Method)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.get(PropertyishBlock.java:73)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.LValueBlock$GetAdapter.receive(LValueBlock.java:30)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.PropertyishBlock$ContinuationImpl.fixName(PropertyishBlock.java:65)
	at jdk.internal.reflect.GeneratedMethodAccessor452.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:575)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ContinuationPtr$ContinuationImpl.receive(ContinuationPtr.java:72)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.impl.ConstantBlock.eval(ConstantBlock.java:21)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.Next.step(Next.java:83)
	at PluginClassLoader for workflow-cps//com.cloudbees.groovy.cps.Continuable.run0(Continuable.java:147)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.SandboxContinuable.access$001(SandboxContinuable.java:17)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.SandboxContinuable.run0(SandboxContinuable.java:49)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThread.runNextChunk(CpsThread.java:180)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThreadGroup.run(CpsThreadGroup.java:422)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:330)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsThreadGroup$2.call(CpsThreadGroup.java:294)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService.lambda$wrap$4(CpsVmExecutorService.java:140)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:139)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at jenkins.security.ImpersonatingExecutorService$1.run(ImpersonatingExecutorService.java:68)
	at jenkins.util.ErrorLoggingExecutorService.lambda$wrap$0(ErrorLoggingExecutorService.java:51)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.call(CpsVmExecutorService.java:53)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.call(CpsVmExecutorService.java:50)
	at org.codehaus.groovy.runtime.GroovyCategorySupport$ThreadCategoryInfo.use(GroovyCategorySupport.java:136)
	at org.codehaus.groovy.runtime.GroovyCategorySupport.use(GroovyCategorySupport.java:275)
	at PluginClassLoader for workflow-cps//org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService.lambda$categoryThreadFactory$0(CpsVmExecutorService.java:50)
	at java.base/java.lang.Thread.run(Thread.java:857)

@dsouzai
Copy link
Contributor

dsouzai commented Nov 28, 2024

yeah i think riscv failues are a known issue.

@dsouzai dsouzai merged commit 8ff3b35 into eclipse-omr:master Nov 28, 2024
12 of 13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants