Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not pool connections before checkpoint restore #29006

Conversation

KyleAure
Copy link
Member

@KyleAure KyleAure commented Jul 9, 2024

@KyleAure
Copy link
Member Author

KyleAure commented Jul 9, 2024

#libby - check headers
#build - personal build (with checkpoint testing)

Running a personal build to verify behavior of checkpoint testing, since I cannot test this locally.

@LibbyBot
Copy link

LibbyBot commented Jul 9, 2024

Your personal pipeline request is at https://libh-proxy1.fyre.ibm.com/cognitive/pipelineAnalysis.html?uuid=1c1a2cb4-174e-468c-8117-ca674667facb

Target locations of links might be accessible only to IBM employees.

@LibbyBot
Copy link

LibbyBot commented Jul 9, 2024

Your personal build request is at https://wasrtc.hursley.ibm.com:9443/jazz/resource/itemOid/com.ibm.team.build.BuildResult/_Fp2UgD4mEe-Hn95lLM_tGQ

Target locations of links might be accessible only to IBM employees.

@LibbyBot
Copy link

The build KyleAure-29006-20240709-1313
https://wasrtc.hursley.ibm.com:9443/jazz/resource/itemOid/com.ibm.team.build.BuildResult/_Fp2UgD4mEe-Hn95lLM_tGQ
completed successfully!

@KyleAure
Copy link
Member Author

Database rotation pipeline: https://libh-proxy1.fyre.ibm.com/cognitive-dev/pipelineAnalysis.html?pipelineId=a02c738a-9f0a-4b5b-ab92-365837c8549f

Target locations of links might be accessible only to IBM employees.

@KyleAure KyleAure force-pushed the 24859-jakarta-data-eclipselink-instanton-fix branch 2 times, most recently from fb1c1fa to 2ca1332 Compare July 15, 2024 13:16
@KyleAure
Copy link
Member Author

#libby - check headers
#build - personal build (with checkpoint testing)

@LibbyBot
Copy link

Your personal build request is at https://wasrtc.hursley.ibm.com:9443/jazz/resource/itemOid/com.ibm.team.build.BuildResult/_LcXlAELBEe-F88sHKWQ5jw

Target locations of links might be accessible only to IBM employees.

@LibbyBot
Copy link

Your personal pipeline request is at https://libh-proxy1.fyre.ibm.com/cognitive/pipelineAnalysis.html?uuid=bb38a1ad-e287-465a-9867-5ce8edcde188

Target locations of links might be accessible only to IBM employees.

@LibbyBot
Copy link

The build KyleAure-29006-20240715-0953
https://wasrtc.hursley.ibm.com:9443/jazz/resource/itemOid/com.ibm.team.build.BuildResult/_LcXlAELBEe-F88sHKWQ5jw
completed successfully!

@KyleAure
Copy link
Member Author

Database rotation pipeline: https://libh-proxy1.fyre.ibm.com/cognitive-dev/pipelineAnalysis.html?pipelineId=e0061917-7e6d-4162-b41a-f384794d11c5

Target locations of links might be accessible only to IBM employees.

@KyleAure KyleAure force-pushed the 24859-jakarta-data-eclipselink-instanton-fix branch 2 times, most recently from 04f43c3 to 4163dfb Compare July 16, 2024 20:52
@KyleAure KyleAure changed the title WIP - Do not pool connections during checkpoint Do not pool connections before checkpoint restore Jul 16, 2024
@KyleAure KyleAure requested review from njr-11 and tjwatson July 16, 2024 21:05
Copy link
Contributor

@njr-11 njr-11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Marking it stale should effectively prevent pooling. I'd be concerned about side-effects though - copying Jim @jmstephensgit for that.
Some things to consider are could it throw off stale connection counts or cause connectionErrorOccurred to be sent when there isn't anything actually stale? And, could being marked stale impact subsequent use of the connection: Transaction enlistment, sharing, whether to subsequently raise connectionErrorOccurred if the connection truly does go stale.
Probably the safest option will be to create a new boolean to indicate poolable=false, and later check this as well as stale to prevent pooling.

@tjwatson
Copy link
Member

Could you rebase on the latest release branch that contains PR #28979 to make sure that InstantOn checkpoint tests passes?

Do not pool connections before checkpoint restore
@KyleAure KyleAure force-pushed the 24859-jakarta-data-eclipselink-instanton-fix branch from 4163dfb to 2761d96 Compare July 17, 2024 21:15
@KyleAure
Copy link
Member Author

KyleAure commented Jul 17, 2024

@njr-11 @jmstephensgit I updated this PR with the suggestion to use a new field to flag a mcWrapper as not poolable.
I tested it out locally and it works.
This will get fully tested by our Checkpoint tests for jakarta data.

@tjwatson I rebased onto integration for the next personal build.

@KyleAure
Copy link
Member Author

#libby - recheck headers
#build - personal build

njr-11
njr-11 previously approved these changes Jul 17, 2024
Copy link
Contributor

@njr-11 njr-11 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I added one comment for Jim to consider when he reviews it.

if (mcWrapper.shouldBeDestroyed() || mcWrapper.hasFatalErrorNotificationOccurred(fatalErrorNotificationTime)
|| ((pm.agedTimeout != -1)
&& (mcWrapper.hasAgedTimedOut(pm.agedTimeoutMillis)))) {
if (mcWrapper.shouldBeDestroyed() || mcWrapper.hasFatalErrorNotificationOccurred(fatalErrorNotificationTime) || !mcWrapper.isPoolable()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be more a question for Jim, but should the method mcWrapper.shouldBeDestroyed() also be updated to consider the poolable=false value. If so, you would not need the || !mcWrapper.isPoolable() here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The isPoolable() is clear on its intended use.

If we need to optimize this area of the code under a different pull, its likely we should have a discard boolean that is set for any connections that should be discarded, but still have additional support variables that represent why its being discarded for trace and debugging. One discard boolean would clean up this code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@njr-11 I looked at other places we use shouldBeDestroyed() and it was typically used to check if their was already a connection error presented to the user. Otherwise, these debug messages are hidden. For example:

if (!mcWrapper.shouldBeDestroyed()) {
mcWrapper.markTransactionError();
Tr.error(tc, "XA_RESOURCE_ADAPTER_OPERATION_ID_EXCP_J2CA0027", "commit", xid, e, mcWrapper.gConfigProps.cfName);
}

Which is why I decided to add a new method

@LibbyBot
Copy link

Your personal pipeline request is at https://libh-proxy1.fyre.ibm.com/cognitive/pipelineAnalysis.html?uuid=d88c16cd-3928-4c85-8d44-147b9270f211

Target locations of links might be accessible only to IBM employees.

@LibbyBot
Copy link

Your personal build request is at https://wasrtc.hursley.ibm.com:9443/jazz/resource/itemOid/com.ibm.team.build.BuildResult/_MhadkESaEe-ngPvsOTExmw

Target locations of links might be accessible only to IBM employees.

@LibbyBot
Copy link

Code analysis and actions

DO NOT DELETE THIS COMMENT.
  • 6 product code files were changed.
  • Please describe in a separate comment how you tested your changes.

@LibbyBot
Copy link

jmstephensgit
jmstephensgit previously approved these changes Jul 18, 2024
@KyleAure KyleAure dismissed stale reviews from jmstephensgit and njr-11 via fd681d2 July 18, 2024 14:32
@KyleAure
Copy link
Member Author

After talking this over more with @tjwatson it would seem this is not the correct direction to go.
The concern is that even if the connection is closed before checkpoint the common

scenario is that you are containerizing your application in a build pipeline. During that build we do a checkpoint that can be stored in the container image. We cannot make external connections to a database during that pipeline build of the container image for many reasons. It would store credentials in the image, which is bad practice. It would try to connect to some production database during container image build, also bad.

So really this change is just masking the problem.
Instead we should try to get Eclipselink to hold off on creating a connections until the EntityManager is actually used for the first time.

@KyleAure KyleAure closed this Jul 18, 2024
@tjwatson
Copy link
Member

Instead we should try to get Eclipselink to hold off on creating a connections until the EntityManager is actually used for the first time.

That sounds nice for InstantOn scenario. But may cause some unexpected latency for other scenarios where we would have done this work as part of server startup?

@KyleAure
Copy link
Member Author

But may cause some unexpected latency for other scenarios where we would have done this work as part of server startup?

That is true, but the latency here is all related to the opening of the first connection.
I think it's common place for this type of action to be lazy and to happen when the resource is actually needed.
We do have a feature (in progress) for pre-populating the connection pool which could be the alternative solution if a user would rather not wait for their first connection within their application.

Opened an issue here: eclipse-ee4j/eclipselink#2213

@LibbyBot
Copy link

The build KyleAure-29006-20240717-1819
https://wasrtc.hursley.ibm.com:9443/jazz/resource/itemOid/com.ibm.team.build.BuildResult/_MhadkESaEe-ngPvsOTExmw
completed successfully!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants