-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix #1173: problems stopping instances #1178
Conversation
…factor stop/signal handling routines
I've worked on this code before and it still took me a long time to re-figure out why it was doing that.
Test Results247 tests 245 ✅ 1m 21s ⏱️ For more details on these failures, see this check. Results for commit 56d2be7. |
The reason why I'm wondering if this has improved the unacked message situation in flakey_broker is because the polls were being killed previously (case 2 above):
And now they are not being killed, because of the interruptible_sleep, so there's an improvement there. I just not sure if it has actually fixed the unacked messages problem, it still could be coincidental that I've never got unacked messages with the changes on this branch. Waiting for GitHub Actions flow tests to finish to have a few more flakey runs. |
And GitHub Actions did give a counter-example that proves me wrong :( https://github.com/MetPX/sarracenia/actions/runs/10493296674/job/29066922812?pr=1178 Here the polls never got killed, but there were still waiting unacked and ready messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It took me a while to understand, but this change looks great!
I ran the flow tests on 24.04, and it does seem to run much more cleanly.
There isn't the same meandering about with termination of the flakey test.
I think it's a big improvement.
This fixes multiple problems:
flow/__init__.py
works in instance 1 and cleanly stops the instance, but the signal handler inmoth/amqp.py
takes over in instances >1. The one inflow/__init__.py
never executes, so the instance never shuts itself down. This forcessr3 stop
to always kill instances > 1.Only one signal handler at a time can be used, so whichever one was registered last was being called.
please_stop
callbacks were not being called, except in two rare cases (when messageCountMax is enabled and the max. message count is reached; or in configs where sleep is <0) (regression from please stop infinite loop by calling self #1070).Now please_stop callbacks are called when SIGTERM or SIGINT is received.
interruptible_sleep
function, a sleep that can be interrupted a variable becoming True.I'm not 100% sure it actually fixed the problem, but I did multiple runs of flakey_broker and never saw any unacked messages... it could be a coincidence.