Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] Improve eXist-db response to Java heap overflow #5439

Open
djbpitt opened this issue Sep 3, 2024 · 6 comments
Open

[feature] Improve eXist-db response to Java heap overflow #5439

djbpitt opened this issue Sep 3, 2024 · 6 comments
Labels
enhancement new features, suggestions, etc.
Milestone

Comments

@djbpitt
Copy link

djbpitt commented Sep 3, 2024

Is your feature request related to a problem? Please describe.

My eXist-db (6.2.0 running on Centos 7 as a service) recently stalled with a heap overflow. I have the service configured to restart automatically, but this failure was not recognized as stopping, which means that eXist-db did not restart, and instead remained nonresponsive. My experience is limited to … er … my own experience, but I see no advantage to this behavior, and I would prefer a configuration where an eXist-db service recognizes a heap overflow and treats it like a service stop, that is, a configuration where the service restarts automatically on heap overflow. @adamretter has informed me (in the eXist-db Slack—thank you for the quick response!) that stopping on a heap overflow (which in my case would trigger a service restart) is available by setting the JAVA_OPTS environment variable to -XX:+ExitOnOutOfMemoryError, but that is not the default configuration. The point of this feature request is to ask that this setting be configured as the default.

Describe the solution you'd like
I would like JAVA_OPS=-XX:+ExitOnOutOfMemoryError to be the default configuration.

Describe alternatives you've considered
I can recover from a heap overflow by restarting manually, but that requires knowing that the heap overflow has occurred. Since the system becomes nonfunctional after a heap overflow, I see no advantage to making "remain nonfunctional until the administrator notices the problem and restarts manually" the default behavior, as currently appears to be the case.

Additional context
No additional comments or suggestions.

@djbpitt djbpitt added the enhancement new features, suggestions, etc. label Sep 3, 2024
@line-o line-o added this to the eXist-6.2.1 milestone Sep 3, 2024
@dizzzz
Copy link
Member

dizzzz commented Sep 8, 2024

@duncdrum would this make sense to add this to the docker containers too?

@duncdrum
Copy link
Contributor

duncdrum commented Sep 8, 2024

@dizzzz it already is both on the existdb/existdb and duncdrum/existdb images. Its a sensible thing to put as default .

@adamretter
Copy link
Contributor

adamretter commented Sep 8, 2024

It's not necessarily sensible to have as a default if your container or systemd unit is set to auto restart; as that could further damage a corrupted db.

@djbpitt
Copy link
Author

djbpitt commented Sep 8, 2024

@adamretter Thank you for the warning! I thought (when I proposed this enhancement) that the appropriate first action when eXist-db halted after a heap overflow was to restart it, which is what I tried manually (and successfully) when I ran into this problem recently. Perhaps I've misread the notifications, but I thought that when eXist-db was restarted after an unclean interruption it ran diagnostics and tried to correct any corruptions that might have caused the initial failure. Was restarting eXist-db the wrong thing for me to do? If so, what would be a safer strategy?

@duncdrum
Copy link
Contributor

duncdrum commented Sep 8, 2024

@adamretter its been the default since May 20, 2019 for ephemeral container images, you merged the corresponding PR. I have yet to see it not perform as intended with auto restarting containers. Do you have a reproducible error condition where corruption occurs?

@adamretter
Copy link
Contributor

you merged the corresponding PR

Yes, but that is for as you described "ephemeral containers", by default Docker does not restart your container when it exits.

Do you have a reproducible error condition where corruption occurs

No, or it would be a bug that I would fix. But as we all know, database corruption with eXist-db occurs, and when it does you don't want to compound the issue.

@adamretter adamretter modified the milestones: eXist-6.2.1, eXist-6.3.1 Oct 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new features, suggestions, etc.
Projects
None yet
Development

No branches or pull requests

5 participants