Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

I propose exposing the pidfd instance for Tokio-spawned child processes on Linux #6980

Closed

Conversation

ChrysoliteAzalea
Copy link

Hello, everyone! I propose giving the tokio::process::Child client code access to the underlying pidfd instance on Linux.

Motivation

Pidfd instances on Linux are file descriptors referring to a process (either running or terminated). Tokio uses these file descriptors internally to receive notifications when Tokio-spawned child process has terminated. If pidfd is not supported, Tokio falls back to checking all watched child processes on receiving the SIGCHLD signal. However, watching and waiting for child processes is not the only use-case for pidfd, it can be used to send arbitrary signals to a child (pidfd_send_signal()) if you have signalling rights, intervene with its operation (process_madvise(), pidfd_getfd()) if you have ptrace rights, entering child's namespace (setns()). Future versions of Linux may add other use-cases for the pidfd instances.

Solution

I propose adding a new unsafe method on tokio::process::Child structure -- the get_pidfd() that returns a pidfd instance as long as it's supported by the operating system (this is obviously a Linux-only feature). It's unsafe because closing it may break the tokio::process::Child operation, therefore, the client code should probably use something like std::os::fd::BorrowedFd to wrap it.

@Darksonn Darksonn added A-tokio Area: The main tokio crate M-process Module: tokio/process labels Nov 18, 2024
@Darksonn Darksonn requested a review from ipetkov November 18, 2024 09:19
@Darksonn
Copy link
Contributor

I'm concerned about the forwards-compatibility impact of this change. If we expose this method, then we can never change the implementation to use something different than pidfd internally. And it's not been very long since we last changed the internals to use pidfd; we used to use a signal handler instead.

@ipetkov
Copy link
Member

ipetkov commented Nov 18, 2024

Thanks for the PR @ChrysoliteAzalea! I share @Darksonn 's concerns here and I'm generally opposed to this direction.

To say it in my own words:

  • The pidfd implementation is fairly new and is definitely not the standard: we'll fall back to signal handling if pidfd is not available or otherwise does not work
  • Exposing the pidfd publicly like this has compatibility hazards in making it harder to change things in the future without "breaking changes": even though the proposed API addition would return an Option<ProcessHandle>, if we were to suddenly remove the pidfd implementation outright anyone who relies on it would see it as a breaking change (we can argue whether or not this is "legal" as per semver rules, but at the end of the day, its still an impactful change)
  • Even if the API is marked as unsafe, there is still the challenge in documenting the exact invariants that must be upheld, esp if we evolve what the internals are performing and how an outside caller could mess with them by accessing the raw handle directly. This also pushes us in a direction where callers may need to be deeply familiar with Tokio internals which starts feeling as a leaky design
  • Being able to send arbitrary signals, intervening with the process internals, or interacting with the child's namespace all sound very appealing, but at the end of the day, those are platform-specific implementations (which also may or may not be available depending on what kernel version is in use) and I'm not sure if Tokio needs to strive to cater to all those use cases. Tokio aims to be very stable and we're generally conservative at stabilizing new APIs or baseline requirements (whether that's the compiler toolchain or minimum supported OS version). For something this platform-specific and low-level, it might be better implemented in a separate crate focused entirely on pidfd mechanics (esp ones with a large API surface area)

@ChrysoliteAzalea
Copy link
Author

ChrysoliteAzalea commented Nov 18, 2024

@Darksonn @ipetkov Thank you for your replies! If holding the pidfd instance is seen as an implementation detail, then I'm not sure whether exposing it would be appropriate. I've just seen that std::process::Child holds pidfd internally (on Linux, if asked during construction with create_pidfd() on std::process::Command instance), and that it's exposed to users. If Tokio uses pidfd as just a way to poll/wait/kill the process (rather than storing a handle that can be freely given out), then it can be more appropriate if users that need a pidfd instance acquired it by other means (such as explicitly calling pidfd_open(), or use std::process::Child and then register the pidfd with tokio::io::unix::AsyncFd.

there is still the challenge in documenting the exact invariants that must be upheld, esp if we evolve what the internals are performing and how an outside caller could mess with them by accessing the raw handle directly.

As far as I understand, users should abstain from closing the pidfd, re-seating it with dup2() or dup3() and waiting the process. It's similar to I/O safety requirements.

Being able to send arbitrary signals, intervening with the process internals, or interacting with the child's namespace all sound very appealing, but at the end of the day, those are platform-specific implementations (which also may or may not be available depending on what kernel version is in use) and I'm not sure if Tokio needs to strive to cater to all those use cases. Tokio aims to be very stable and we're generally conservative at stabilizing new APIs or baseline requirements (whether that's the compiler toolchain or minimum supported OS version). For something this platform-specific and low-level, it might be better implemented in a separate crate focused entirely on pidfd mechanics (esp ones with a large API surface area)

I understand, given Tokio already has tokio::io::unix::AsyncFd that can work with pidfd instances acquired by other means.

@Darksonn
Copy link
Contributor

For now I will close this PR. If the standard library stabilizes create_pidfd we can reconsider.

@Darksonn Darksonn closed this Nov 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate M-process Module: tokio/process
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants