After upgrading to 0.5.0, we noticed a strange behaviour where our Kafka consumer, launched in a separate coroutine, was abruptly stopped a few seconds after startup. Notably, the application itself did not terminate or crash. The Kafka coroutine runs within the same scope as the suspended app and polls for messages as long as the job is active. Initially, it seemed like simply wrapping it in a new scope would solve the issue — and it did, as the consumer then successfully polled and handled messages. However, after digging deeper to find the root cause, we noticed that the application would never stop, regardless of what was put in the `SuspendApp` lambda. Given this extremely simple example:
```kotlin
fun main() = SuspendApp { logger.info { "The app is running!" } }
```
We discovered a significant change in the top-level coroutine launched by `SuspendApp`: after the lambda block executes, the process is exited with status code 0.
This exit triggers the sequence of JVM shutdown hooks, which then executes the `env.OnShutdown` lambda defined in the `unregister` variable, where the job is cancelled (and that explains why our Kafka consumer was suddenly stopped, and why running it in a new scope allowed the consumer to proceed). The process then waits for the cancelled job to complete, but that never happens: the application simply hangs, unless a finite timeout duration is specified, in which case it terminates with a timeout exception.
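The mechanism described above — `exitProcess(0)` kicking off the JVM shutdown-hook sequence — can be sketched with plain stdlib code. The hook body and messages below are illustrative stand-ins, not SuspendApp's actual implementation:

```kotlin
import kotlin.concurrent.thread
import kotlin.system.exitProcess

fun main() {
    // Stand-in for SuspendApp's OnShutdown/unregister logic (illustrative only).
    Runtime.getRuntime().addShutdownHook(thread(start = false) {
        println("shutdown hook: cancelling the job")
        // If the hook blocked here waiting for a cancelled job that never
        // completes, the process would hang exactly as described above.
    })
    println("lambda finished, calling exitProcess(0)")
    exitProcess(0) // runs the registered shutdown hooks, then exits the JVM
}
```

The ordering is the crux: the exit call runs first, and only then does the hook get a chance to cancel and wait on the job.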
My question is: what is the rationale behind exiting the process after the lambda executes, instead of letting the job complete and eventually calling `unregister()` as was done in version 0.4.0? Right now the unregistering logic never seems to run, because `exit(0)` is called first.
If this behaviour is expected, how should the logic of `SuspendApp` be defined to make sure that the application actually terminates at some point?
Thanks in advance for your help! For now, we're holding off on upgrading the library until we get some more insight 😄
There was a small refactor in the codebase to call `System.exit` at the right moments. This was needed for some cloud providers that require an explicit `System.exit` to close pods. To guarantee the same implementation on all platforms, a change was made in the KMP structure.
However, this should absolutely not happen! I am investigating atm and will report back asap.
Also, any ideas or tips to properly test this automatically would be awesome.
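One possible approach — a sketch, not an existing test in the repo, with the jar path and timeout as assumptions — is to run the application as a separate process and assert that it exits on its own within a bounded time, which turns both the "never stops" hang and a non-zero exit status into test failures:

```kotlin
import java.util.concurrent.TimeUnit

// Generic harness: asserts that the given command terminates on its own
// within the timeout and exits with status 0.
fun assertExitsCleanly(command: List<String>, timeoutSeconds: Long = 30) {
    val process = ProcessBuilder(command).redirectErrorStream(true).start()
    // A bounded wait turns "the application never stops" into a test failure.
    check(process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
        "process did not exit within ${timeoutSeconds}s: possible hang on the cancelled job"
    }
    check(process.exitValue() == 0) { "unexpected exit code: ${process.exitValue()}" }
}

fun main() {
    // Exercise the harness with a command that is known to terminate; in a real
    // test you would point it at the built SuspendApp application, e.g.
    // assertExitsCleanly(listOf("java", "-jar", "build/libs/app.jar"))
    assertExitsCleanly(listOf("java", "-version"), timeoutSeconds = 10)
    println("process terminated cleanly")
}
```

Running the app out-of-process is the key design choice here: shutdown hooks and `System.exit` behaviour cannot be observed reliably from within the same JVM.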
Ported from arrow-kt/suspendapp#140, issue by @weronkagolonka