-
Notifications
You must be signed in to change notification settings - Fork 15
[FIX] Suppression of stacktrace on a shutdown #187
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Wallas Santos <[email protected]>
👋 Hi! Thank you for contributing to vLLM support on Spyre.
Or this can be done with
Now you are good to go 🚀 |
bot:test |
Signed-off-by: Wallas Santos <[email protected]>
I've had a bit of experience messing with signal handlers, and the annoying problem is that setting a handler with Maybe we can use a similar strategy here though, to setup a handler that calls the vllm one, plus whatever is already registered:
However, if we think that the one set by the g3log is not required for proper shutdown, then we should probably just override it to the vllm one and call it a day |
I did some investigation. As I understand it, Python's So I think the change here to override the signal handler is a good fix. Also, the change to the handler happens after the warmup, so the g3log handler is still be active during the compliation phase when we would want to have the low-level stack trace if there is a crash. |
Thanks @joerunde for the feedback and suggestions. Pretty cool ideas.
I guess it is more like that. What I observed is that g3log is overriding the signals, the original signal handler of vllm is completely overwritten and it is never called, so I pretty sure that would do to any signals as well... I wrote this PR more thinking as workaround than an ideal solution. I think an ideal solution would be setup g3log to not override other signals. Unfortunately, I did not saw in g3log docs or code ways to setup the signal handler indirectly like using environment variables. It looks like this should be done by the application itself. While I appreciate your solution, I don't think we'll have extra benefits, because g3log have already clean up everything. |
@tjohnson31415 just share some thoughts at the same time as me 😄 . Thanks! |
vllm_spyre/envs.py
Outdated
# vLLM V1 + torch_sendnn backend to be able to gracefully | ||
# shutdown the engine. | ||
"VLLM_SPYRE_OVERRIDE_SIGNALS_HANDLER": | ||
lambda: os.getenv("VLLM_SPYRE_OVERRIDE_SIGNALS_HANDLER", "0"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently setting the value of VLLM_SPYRE_OVERRIDE_SIGNALS_HANDLER
to the string "0" which is truthy, and it would require an empty env var for a falsey value.
Also, the default should be to have a graceful shutdown, i.e. enable the override.
lambda: os.getenv("VLLM_SPYRE_OVERRIDE_SIGNALS_HANDLER", "0"), | |
lambda: bool(int(os.getenv("VLLM_SPYRE_OVERRIDE_SIGNALS_HANDLER", "1"))), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
I set first disabled as default, because the libraries that installed g3log might need it to dump their log. However, there are other signals that we do not touch that might happen in runtime errors. So for the SIGTERM and SIGINT, which are for graceful shutdown, the override would be fine.
Signed-off-by: Wallas Santos <[email protected]>
Signed-off-by: Wallas Santos <[email protected]>
Co-authored-by: Travis Johnson <[email protected]> Signed-off-by: Wallas Santos <[email protected]>
18ff336
to
65664f9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR took me down a bit of a rabbit hole on signal handling 😅
Great fix though. LGTM
Ah, yeah TIL about the C++ vs Python signal handling 🤦 LGTM! |
This PR adds a workaround to suppress logs that are dumped at the shutdown of the engine (only on V1) when vllm runs with multiprocess. The undesired behavior happens because g3log from Spyre stack/runtime overrides the signal handler from vLLM when it starts a process for the engine code. Therefore, the engine does not have a chance to gracefully shutdown.