Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OmeroRequest: close resources on client.joinSession() failure #35

Merged

Conversation

sbesson
Copy link
Member

@sbesson sbesson commented Feb 11, 2025

As described in https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.20.3, a resource is closed only if it initialized to a non-null value. In the rare event of the client.joinSession() call throwing an exception, the OmeroRequest.close() statement will never be invoked and clean all resources associated with the joinSession(). Given the current implementation of the omero.client object, this can cause an accumulation of objects within a micro-service process which would only be cleaned up by a restart.

This commit handles this scenario by catching the exception in the constructor, closing the resources and throwing a CannotCreateSessionException exception which is already part of the signature.

Testing

Given this issue is at the OmeroRequest level, it will affect all micro-services and can be tested using any of them. I used omero-ms-pixel-buffer but these instructions can be adjusted to omero-ms-image-region or omero-ms-thumbnail.

To look at the number of objects within the process, usejmap -histo:live <ms_pid> | grep omero.client.

First test a regular authentication flow,

  1. authenticate via OMERO.web and retrieve the sessionid=<sessionid> cookie associated with requested e.g. from the Developer tools
  2. create a local headers file containing Cookie: sessionid=<sessionid> locally
  3. make sure the omero-ms-pixel-buffer service is running. It might be useful to set https://github.com/glencoesoftware/omero-ms-pixel-buffer/blob/0d2976a7680c83e8b48c1946128c667d8632ecee/src/dist/conf/logback.xml#L21 at DEBUG level
  4. run a sequence of requests to the micro-service e.g.
for i in $(seq 0 100); do curl -H@headers  "http://localhost:8082/tile/<imageid>/0/0/0?x=0&y=0&w=10&h=10" -o file; done
  1. the logs should indicate the requests are being processed and the output of jmap -histo:live should not include several omero.client objects in memory as these should be cleaned up

To reproduce the resource leak, retrieve the OMERO session ID associated with the request. This can be done either by looking at the Blitz logs, reading and decoding the Redis key associated with the OMERO.web session or looking at the DEBUG statements in the micro-service logs (Successfully joined session: <omerosessionid>)

  1. terminate the session e.g. by running omero login <omerosessionid>@localhost -w <omerosessionid> && omero logout
  2. re-run the same sequence as in step 4
  3. the micro-service logs should log a Permission denied statement at DEBUG level. The output of jmap, there should be many omero.client objects in memory, as many as the number of failed requests

To test these changes, replace the omero-ms-core JAR under the lib folder of the binary with a build of this Pull Request and restart the service. Then repeat the workflow above. The behavior should be identical with the exception of the last step. The logs of the micro-service should include an ERROR level message (Failed to join session: <omerosessionid>) and the output of jmap -histo:live should be identical to step 5 i.e. the omero.client objects should no longer accumulate in memory

As described in https://docs.oracle.com/javase/specs/jls/se8/html/jls-14.html#jls-14.20.3,
a resource is closed only if it initialized to a non-null value.
In the rare event of the client.joinSession() call throwing an exception,
the OmeroRequest.close() statement will never be invoked and clean all
resources associated with the session joining operations.
Given the semantics of the omero.client object, this can cause an
accumulation of objects within a micro-service process which would only
be cleaned up by a restart.

This commit handles this scenario by catching the exception in the constructor,
closing the resources and throwing a CannotCreateSessionException exception which
is already part of the signature.
@sbesson sbesson requested review from chris-allan and kkoz February 11, 2025 16:19
@kkoz
Copy link
Member

kkoz commented Feb 12, 2025

Tested using the /pathviewer/imgData endpoint in omero-ms-image-region.
With the bug, running jmap gave me

kevin@kevin-22042:~/code/omero-ms-core$ jmap -histo:live 626120 | grep omero.client
 675:             1            368  omero.client$1

While the session was active regardless of how many requests were sent.
Then I logged out of the session and ran

for i in $(seq 0 100); do curl -H@headers "http://localhost:8081/pathviewer/imgData/2/" -o file; done;

And then jmap gave

kevin@kevin-22042:~/code/omero-ms-core$ jmap -histo:live 626120 | grep omero.client
 258:            64           3584  omero.client
 968:             1            368  omero.client$1

Note that I didn't get 100 omero.client objects and in fact could not get the number to increase past 64 regardless of how many requests were subsequently sent. So there is a window in which the bug creates extra objects but outside of that window it will not.

I think replaced the jar, restarted the microservice, and did the same thing. jmap then gave

kevin@kevin-22042:~/code/omero-ms-core$ jmap -histo:live 635354 | grep omero.client
 706:             1            368  omero.client$1

Which was expected with the bug fix.

@sbesson sbesson merged commit 643c0e3 into glencoesoftware:master Feb 13, 2025
3 checks passed
@sbesson sbesson deleted the omerorequest_joinsession_resource_leak branch February 13, 2025 12:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants