Block signal delivery to OpenBLAS-spawned threads #4054

Open
wants to merge 1 commit into
base: develop

@peadar peadar commented May 24, 2023

Applications that are otherwise non-threaded may use signals in ways that are currently disrupted by OpenBLAS launching threads.

For example, it is not uncommon for an application's main loop to block signals while busy and then unblock them while waiting for I/O (see the sigmask argument to ppoll(2)).

Signals that arrive during ppoll(2) will interrupt the system call and allow the application to handle any consequences of that signal arriving. Normally (in a single-threaded process), on delivery of an externally generated signal such as SIGALRM, SIGCHLD, or SIGIO, the main loop is woken. If the thread is otherwise busy, the signal remains pending and is delivered when the application next enters its idle state (e.g. ppoll) and unblocks signals again.
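The block-while-busy, unblock-while-idle pattern described above might look roughly like the following sketch. It assumes Linux with glibc (ppoll is not POSIX) and a single-threaded process; the helper name `idle_wait_sees_pending_signal`, the handler, and the choice of SIGALRM are hypothetical and only for illustration.

```c
#define _GNU_SOURCE /* ppoll() is a Linux extension in glibc */
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t alarm_seen = 0;

static void on_alarm(int signo) { (void)signo; alarm_seen = 1; }

/* Hypothetical helper: returns 1 if a signal sent during the "busy" phase
 * stayed pending and then interrupted ppoll(), 0 if not, -1 on setup error. */
int idle_wait_sees_pending_signal(int timeout_ms) {
    struct sigaction sa;
    sigset_t busy, saved, idle;
    struct timespec ts;

    assert(timeout_ms >= 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, NULL) != 0) return -1;

    /* Busy phase: SIGALRM is blocked, so it can only become pending. */
    sigemptyset(&busy);
    sigaddset(&busy, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &busy, &saved) != 0) return -1;

    alarm_seen = 0;
    kill(getpid(), SIGALRM);           /* process-directed, like an external signal */
    int seen_while_busy = alarm_seen;  /* expected to still be 0 here */

    /* Idle phase: ppoll() atomically installs a mask with SIGALRM unblocked. */
    idle = saved;
    sigdelset(&idle, SIGALRM);
    ts.tv_sec = timeout_ms / 1000;
    ts.tv_nsec = (long)(timeout_ms % 1000) * 1000000L;
    int rc = ppoll(NULL, 0, &ts, &idle);
    int interrupted = (rc == -1 && errno == EINTR);

    sigprocmask(SIG_SETMASK, &saved, NULL); /* restore the caller's mask */
    return !seen_while_busy && interrupted && alarm_seen;
}
```

In a single-threaded process, `idle_wait_sees_pending_signal(2000)` is expected to return 1: the handler only runs once ppoll swaps in the idle mask.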

OpenBLAS creates threads during initialization. Such threads inherit their signal mask from the thread that creates them, and if the library is loaded very early in the lifetime of a process, all signals are nominally unblocked in these threads.

Later, if the "main" thread is running with signals blocked when a signal is sent to the process, the kernel will deliver it to another thread in the process that is not currently blocking that signal: in our case, one of the OpenBLAS threads.

This means that by creating threads with open signal masks, OpenBLAS is potentially interfering with the normal operation of programs that are otherwise non-threaded.

Instead, we should block all signals before starting new threads from blas_thread_init, and then restore the signal mask as it was, so the launched threads do not participate in signal delivery and processing.

Signed-off-by: Peter Edwards <[email protected]>