make jobserver hangs after invoking sccache if server isn't already spawned #2145

Open
64 opened this issue Apr 6, 2024 · 2 comments

64 commented Apr 6, 2024

Minimal reproducer (sccache v0.7.7, GNU make v4.4.1, cargo v1.78.0-nightly 194a60b29):

$ printf 'all:\n\tsccache cc main.c' > Makefile
$ printf 'int main(){}' > main.c
$ sccache --stop-server
$ make -j2
# <----- hangs here!

Workaround: run sccache --start-server before invoking make. Alternatively, invoke make with --jobserver-style=pipe.


64 commented Apr 7, 2024

Looking into this a bit further, I ran strace -f -yY -o log.txt make -j2 to see what's going on. Grepping the log for the jobserver FIFO ("GMfifo") shows every operation on it. Note that the resumed halves of some syscalls are missing because strace interleaves output from different processes.

log with sccache

13767<make> mknodat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo13767", S_IFIFO|0600) = 0
13767<make> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo13767", O_RDONLY|O_NONBLOCK) = 3</tmp/GMfifo13767>
13767<make> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo13767", O_WRONLY) = 4</tmp/GMfifo13767>
13767<make> fcntl(3</tmp/GMfifo13767>, F_GETFD) = 0
13767<make> fcntl(3</tmp/GMfifo13767>, F_SETFD, FD_CLOEXEC) = 0
13767<make> fcntl(4</tmp/GMfifo13767>, F_GETFD) = 0
13767<make> fcntl(4</tmp/GMfifo13767>, F_SETFD, FD_CLOEXEC) = 0
13767<make> write(4</tmp/GMfifo13767>, "+", 1) = 1
13767<make> fcntl(3</tmp/GMfifo13767>, F_GETFL) = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
13767<make> fcntl(3</tmp/GMfifo13767>, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
13768<cargo> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo13767", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo13767>
13770<sccache> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo13767", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo13767>
13789<sccache> openat(AT_FDCWD</>, "/tmp/GMfifo13767", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo13767>
13839<rustc> openat(AT_FDCWD</>, "/tmp/GMfifo13767", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo13767>
13842<rustc> openat(AT_FDCWD</>, "/tmp/GMfifo13767", O_RDWR|O_CLOEXEC <unfinished ...>
13842<rustc> <... openat resumed>)      = 3</tmp/GMfifo13767>
13843<rustc> openat(AT_FDCWD</>, "/tmp/GMfifo13767", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo13767>
13846<rustc> read(3</tmp/GMfifo13767>,  <unfinished ...>
13847<coordinator> write(3</tmp/GMfifo13767>, "+", 1 <unfinished ...>
13852<rustc> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo13767", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo13767>
13770<sccache> close(3</tmp/GMfifo13767>) = 0
13857<sccache> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo13767", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo13767>
13874<rustc> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo13767", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo13767>
13876<rustc> read(3</tmp/GMfifo13767>,  <unfinished ...>
13876<rustc> read(3</tmp/GMfifo13767>,  <unfinished ...>
13877<coordinator> write(3</tmp/GMfifo13767>, "+", 1 <unfinished ...>
13877<coordinator> write(3</tmp/GMfifo13767>, "+", 1 <unfinished ...>
13876<rustc> read(3</tmp/GMfifo13767>,  <unfinished ...>
13876<rustc> read(3</tmp/GMfifo13767>,  <unfinished ...>
13877<coordinator> write(3</tmp/GMfifo13767>, "+", 1) = 1
13877<coordinator> write(3</tmp/GMfifo13767>, "+", 1 <unfinished ...>
13876<rustc> read(3</tmp/GMfifo13767>,  <unfinished ...>
13877<coordinator> write(3</tmp/GMfifo13767>, "+", 1 <unfinished ...>
13876<rustc> read(3</tmp/GMfifo13767>,  <unfinished ...>
13877<coordinator> write(3</tmp/GMfifo13767>, "+", 1 <unfinished ...>
13876<rustc> read(3</tmp/GMfifo13767>,  <unfinished ...>
13877<coordinator> write(3</tmp/GMfifo13767>, "+", 1 <unfinished ...>
13857<sccache> close(3</tmp/GMfifo13767>) = 0
13767<make> fcntl(3</tmp/GMfifo13767>, F_GETFL) = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
13767<make> fcntl(3</tmp/GMfifo13767>, F_SETFL, O_RDONLY|O_LARGEFILE) = 0
13767<make> close(4</tmp/GMfifo13767>)  = 0
13767<make> read(3</tmp/GMfifo13767>, "+", 1) = 1
13767<make> read(3</tmp/GMfifo13767>,

log without sccache

15030<make> mknodat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo15030", S_IFIFO|0600) = 0
15030<make> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo15030", O_RDONLY|O_NONBLOCK) = 3</tmp/GMfifo15030>
15030<make> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo15030", O_WRONLY) = 4</tmp/GMfifo15030>
15030<make> fcntl(3</tmp/GMfifo15030>, F_GETFD) = 0
15030<make> fcntl(3</tmp/GMfifo15030>, F_SETFD, FD_CLOEXEC) = 0
15030<make> fcntl(4</tmp/GMfifo15030>, F_GETFD) = 0
15030<make> fcntl(4</tmp/GMfifo15030>, F_SETFD, FD_CLOEXEC) = 0
15030<make> write(4</tmp/GMfifo15030>, "+", 1) = 1
15030<make> fcntl(3</tmp/GMfifo15030>, F_GETFL) = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
15030<make> fcntl(3</tmp/GMfifo15030>, F_SETFL, O_RDONLY|O_NONBLOCK|O_LARGEFILE) = 0
15031<cargo> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo15030", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo15030>
15033<rustc> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo15030", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo15030>
15038<rustc> openat(AT_FDCWD</home/matt/code/tmp/example>, "/tmp/GMfifo15030", O_RDWR|O_CLOEXEC) = 3</tmp/GMfifo15030>
15040<rustc> read(3</tmp/GMfifo15030>,  <unfinished ...>
15040<rustc> read(3</tmp/GMfifo15030>,  <unfinished ...>
15041<coordinator> write(3</tmp/GMfifo15030>, "+", 1 <unfinished ...>
15041<coordinator> write(3</tmp/GMfifo15030>, "+", 1 <unfinished ...>
15040<rustc> read(3</tmp/GMfifo15030>,  <unfinished ...>
15040<rustc> read(3</tmp/GMfifo15030>,  <unfinished ...>
15041<coordinator> write(3</tmp/GMfifo15030>, "+", 1) = 1
15041<coordinator> write(3</tmp/GMfifo15030>, "+", 1 <unfinished ...>
15040<rustc> read(3</tmp/GMfifo15030>,  <unfinished ...>
15041<coordinator> write(3</tmp/GMfifo15030>, "+", 1 <unfinished ...>
15040<rustc> read(3</tmp/GMfifo15030>,  <unfinished ...>
15041<coordinator> write(3</tmp/GMfifo15030>, "+", 1 <unfinished ...>
15040<rustc> read(3</tmp/GMfifo15030>,  <unfinished ...>
15041<coordinator> write(3</tmp/GMfifo15030>, "+", 1 <unfinished ...>
15030<make> fcntl(3</tmp/GMfifo15030>, F_GETFL) = 0x8800 (flags O_RDONLY|O_NONBLOCK|O_LARGEFILE)
15030<make> fcntl(3</tmp/GMfifo15030>, F_SETFL, O_RDONLY|O_LARGEFILE) = 0
15030<make> close(4</tmp/GMfifo15030>)  = 0
15030<make> read(3</tmp/GMfifo15030>, "+", 1) = 1
15030<make> read(3</tmp/GMfifo15030>, "", 1) = 0
15030<make> close(3</tmp/GMfifo15030>)  = 0
15030<make> unlink("/tmp/GMfifo15030")  = 0

The tokens appear to be returned to the jobserver correctly in both cases, but with sccache make never sees its final read() return 0 (EOF), because one of the sccache processes keeps the FIFO open: the sccache log shows 3 openat calls by sccache processes but only 2 close calls. lsof /tmp/GMfifo15030 confirms this.
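The EOF behaviour described above can be reproduced without make or sccache. The following is a minimal sketch, assuming Linux FIFO semantics; `fifo_eof_demo` and the `demo-fifo` path are illustrative names, and the `holder` descriptor merely stands in for whichever process keeps the FIFO open. The reader is left non-blocking so the demo cannot hang: where make's blocking read() would stall, this code gets `BlockingIOError` instead.

```python
import os
import tempfile


def fifo_eof_demo():
    """Replay make's FIFO open sequence with one extra lingering opener.

    Returns (first_read, blocked_while_held, read_after_release).
    """
    tmpdir = tempfile.mkdtemp()
    path = os.path.join(tmpdir, "demo-fifo")  # hypothetical path, not make's
    os.mkfifo(path, 0o600)
    try:
        # Same order as make in the strace log: reader first, then writer.
        rd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        wr = os.open(path, os.O_WRONLY)
        os.write(wr, b"+")

        # Stand-in for the lingering process: O_RDWR also counts as a writer.
        holder = os.open(path, os.O_RDWR | os.O_CLOEXEC)

        # make closes its own write end before draining the FIFO.
        os.close(wr)
        first = os.read(rd, 1)  # the token written above

        try:
            os.read(rd, 1)
            blocked = False
        except BlockingIOError:
            # FIFO is empty but a writer still exists: a blocking read
            # (what make does at this point) would wait forever.
            blocked = True

        os.close(holder)
        after = os.read(rd, 1)  # no writers remain, so this is EOF (b"")
        os.close(rd)
        return first, blocked, after
    finally:
        os.unlink(path)
        os.rmdir(tmpdir)
```

On Linux, `fifo_eof_demo()` should return `(b"+", True, b"")`: the read only reaches EOF once the last descriptor with write access is closed, which matches the difference between the two strace logs.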


64 commented Apr 7, 2024

It seems to be caused by a relatively new GNU Make feature (>4.3.90) in which jobserver communication goes through a named FIFO (--jobserver-auth=fifo:/tmp/GMfifoXXXX) (commit, docs) rather than through a pipe (--jobserver-auth=R,W).

Indeed, passing make --jobserver-style=pipe causes the reproducer in the OP to exit successfully, whereas --jobserver-style=fifo hangs. Cargo seems to manage its own jobserver via pipes, so the issue only appears when make launches sccache directly, or launches it indirectly via cargo, and forces everything to use a named FIFO.
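To check which style a given build is actually using, the two --jobserver-auth forms mentioned above can be told apart by inspecting MAKEFLAGS from inside a recipe. This is a rough sketch only: `jobserver_style` is a made-up helper, and taking the last occurrence of the option is an assumption about how repeated flags should be resolved.

```python
import re

_AUTH_RE = re.compile(r"--jobserver-auth=(\S+)")
_PIPE_RE = re.compile(r"(-?\d+),(-?\d+)")


def jobserver_style(makeflags):
    """Classify the jobserver described by a MAKEFLAGS-style string.

    Returns ("fifo", path), ("pipe", (read_fd, write_fd)), or None when
    no recognisable --jobserver-auth option is present.
    """
    values = _AUTH_RE.findall(makeflags)
    if not values:
        return None
    value = values[-1]  # assumption: the last occurrence is the relevant one
    if value.startswith("fifo:"):
        return ("fifo", value[len("fifo:"):])
    match = _PIPE_RE.fullmatch(value)
    if match:
        return ("pipe", (int(match.group(1)), int(match.group(2))))
    return None
```

For example, `jobserver_style(os.environ.get("MAKEFLAGS", ""))` run from a recipe under make -j2 would be expected to report the fifo form in the hanging case and the pipe form when --jobserver-style=pipe is passed.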

It's not obvious what the right fix is though. Spawning the server daemon with the context of a jobserver sounds inherently broken to me. Instead, shouldn't the server act as if it was spawned from nothing, and clients pass their jobserver info for each compile request they make to the server?
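As a rough illustration of the first half of that idea (the server starting "from nothing"), a client could strip the jobserver options out of the environment before spawning the daemon. This is only a sketch, not how sccache works today: `env_for_daemon` is a hypothetical helper, and the list of environment variables is an assumption about what a jobserver client consults.

```python
import re

# Assumption: variables a jobserver client may read its handles from.
JOBSERVER_VARS = ("CARGO_MAKEFLAGS", "MAKEFLAGS", "MFLAGS")

# Matches the current option name and the older --jobserver-fds spelling.
_JOBSERVER_FLAG = re.compile(r"\s*--jobserver-(?:auth|fds)=\S+")


def env_for_daemon(env):
    """Return a copy of env with jobserver handles removed.

    The input mapping is not modified, so the client itself can keep
    using the jobserver it inherited.
    """
    clean = dict(env)
    for name in JOBSERVER_VARS:
        if name in clean:
            clean[name] = _JOBSERVER_FLAG.sub("", clean[name]).strip()
    return clean
```

The result could be passed as the `env` argument of `subprocess.Popen` when launching the long-lived process. The second half of the proposal, clients forwarding their jobserver details with each compile request, would need a protocol change and is not shown here.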

64 changed the title from "make jobserver hangs when invoking cargo + sccache" to "make jobserver hangs after invoking sccache if server isn't already spawned" on Apr 7, 2024