🐛 Bug: Turbo daemon creates / leaves a ton of <defunct> processes, accumulating enough sometimes to breach the OS-wide process limit, preventing the creation of any new processes. #9455

Open
1 task done
NullVoxPopuli opened this issue Nov 18, 2024 · 3 comments
Labels
kind: bug Something isn't working

Comments

@NullVoxPopuli
NullVoxPopuli commented Nov 18, 2024

Verify canary release

  • I verified that the issue exists in the latest Turborepo canary release.

Link to code that reproduces this issue

I think this affects any turbo project when turbo is run during an interactive rebase.

This is a pretty bad bug, because macOS has a limit of only ~5600 processes, and once you hit that, you can't spawn terminals, open apps, create new browser tabs, or even run ps.

You need to already have Activity Monitor (or similar) open so that you can kill the turbo daemon process. Otherwise you may be forced to reboot.
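
The exact limit differs per machine, so it can help to check it directly. A rough sketch, assuming bash: `kern.maxproc` and `kern.maxprocperuid` are the macOS sysctl names, and the `process_limits` function name is made up for this example.

```shell
process_limits() {
  # Per-user process limit for the current shell (bash builtin).
  echo "ulimit -u: $(ulimit -u)"
  # System-wide limits; these sysctl names only exist on macOS/BSD.
  if [ "$(uname -s)" = "Darwin" ]; then
    echo "kern.maxproc: $(sysctl -n kern.maxproc)"
    echo "kern.maxprocperuid: $(sysctl -n kern.maxprocperuid)"
  fi
}

process_limits
```

Comparing these numbers with the `ps -ef | wc -l` count gives a rough idea of how much headroom is left.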

Which canary version will you have in your reproduction?

2.3.1-canary.0

Environment information

❯ pnpm turbo info
turbo 2.3.1-canary.0

CLI:
   Version: 2.3.1-canary.0
   Path to executable: <.pnpm>/[email protected]/node_modules/turbo-darwin-arm64/bin/turbo
   Daemon status: Running
   Package manager: pnpm9

Platform:
   Architecture: aarch64
   Operating system: macos
   WSL: false
   Available memory (MB): 10455
   Available CPU cores: 12

Environment:
   CI: None
   Terminal (TERM): alacritty
   Terminal program (TERM_PROGRAM): unknown
   Terminal program version (TERM_PROGRAM_VERSION): unknown
   Shell (SHELL): /opt/homebrew/Cellar/bash/5.2.32/bin/bash
   stdin: false

Setup, check processes:

ps -ef | grep defunct | wc -l
# 1 or 2

Normally, a system should be running fewer than about 1000 processes:

ps -ef | wc -l
# I usually hover around 600 to 800

Scenario A (inconsistent)

  • be in interactive rebase
    (I'm splitting commits into more commits)
  • have prepare or postinstall trigger turbo's build
  • run turbo again (maybe for lint, or whatever)

Scenario B (inconsistent)

  • after changing a dependency of a package

Test:

ps -ef | grep defunct | wc -l
# 807

Test after upgrading to latest canary (noting that we run build in postinstall):

❯ ps -ef | grep defunct | wc -l
#    1435

I have an ongoing monitor for this running every second in a terminal that I just leave up all the time.

❯ watch -n 1 "echo \"All: \$(ps -ef | wc -l), Defunct: \$(ps -ef | grep defunct | wc -l)\""
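
One caveat with `grep defunct` is that it also matches the grep process itself, and it relies on the `<defunct>` text in the command column. A possible alternative, as a sketch only, is to count by process state instead: `ps` reports zombies with a state starting with `Z`. The `count_zombie_states` name is hypothetical.

```shell
# Count processes whose state starts with "Z" (zombie/defunct).
# Reads one ps state value per line on stdin, so it can be fed canned input.
count_zombie_states() {
  awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Live usage: -A selects all processes, "stat=" prints the state column with no header.
ps -A -o stat= | count_zombie_states
```

This should drop into the `watch` monitor above in place of the `grep defunct | wc -l` part.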

And with pstree we can see that nearly all of these come from the turbo daemon:

# get a list of all unique parent processes for each defunct process
❯ ps -ef | grep defunct | awk '{print $3}' | sort -u

# pass each of these to pstree
while IFS= read -r pid; do
    pstree -p "$pid"
done <<< "$(ps -ef | grep defunct | awk '{print $3}' | sort -u)"

Which will print something like this:

-+= 00001 root /sbin/launchd
 \-+= 11557 $USER /opt/homebrew/opt/borders/bin/borders
   \--- 11558 $USER <defunct>
-+= 00001 root /sbin/launchd
 \-+= 43271 $USER <.pnpm>/[email protected]/node_modules/turbo-darwin-arm64/bin/turbo --skip-infer daemon
   |--- 43359 $USER <defunct>
   |--- 43361 $USER <defunct>
   # and many hundreds more
   \--- 57042 $USER <defunct>
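
If pstree is not installed, a similar per-parent summary can be had from `ps` alone. This is a sketch under the assumption that zombie states start with `Z`; `zombies_by_parent` is a made-up helper name.

```shell
# Tally zombie processes per parent PID.
# Expects "PPID STATE" pairs on stdin, prints "COUNT PPID" sorted by count, highest first.
zombies_by_parent() {
  awk '$2 ~ /^Z/ { count[$1]++ } END { for (p in count) print count[p], p }' | sort -rn
}

# Live usage: separate -o flags keep the header-suppressing "=" portable.
ps -A -o ppid= -o stat= | zombies_by_parent
```

The PID on the first line of output is the process holding the most defunct children, which in the tree above would be the `turbo --skip-infer daemon` process.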

Expected behavior

No defunct processes should ever exist, as the OS will not clean these up on its own.

Actual behavior

Defunct processes are left lying around.
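
For anyone unfamiliar with how a `<defunct>` entry arises: a child that has exited stays in the process table until its parent collects its exit status. The following is a small illustration of that general mechanism, not a claim about what the turbo daemon does internally. It runs for about five seconds.

```shell
# The subshell forks `sleep 1`, then replaces itself with `sleep 5` via exec.
# `sleep` never waits on children, so the finished `sleep 1` stays defunct
# until its parent exits.
(sleep 1 & exec sleep 5) >/dev/null 2>&1 &
parent_pid=$!

# Give the child time to exit, then look for a Z-state process under that parent.
sleep 2
zombie_line="$(ps -A -o ppid= -o stat= | awk -v p="$parent_pid" '$1 == p && $2 ~ /^Z/')"
echo "zombie under $parent_pid: $zombie_line"

# Once the parent exits, the zombie is reparented and reaped.
wait "$parent_pid"
```

A long-lived parent that spawns children and never waits on them would accumulate entries like this indefinitely, which matches the pattern in the pstree output.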

To Reproduce

It's possible this is reproducible in these OSS repos:

I somewhat regularly have to kill the top-level turbo daemon on Linux due to CPU usage. It's possible that this has the same root cause as the behavior I'm reporting here for macOS.

In both cases, Linux (where I do most of my OSS work) and Mac (where I do my closed-source, employer-owned work), killing the turbo daemon process immediately makes my machines happier: it cleans up the defunct processes (macOS) or frees up CPU cycles (Linux).

Additional context

No response

@NullVoxPopuli NullVoxPopuli added kind: bug Something isn't working needs: triage New issues get this label. Remove it after triage labels Nov 18, 2024
@wagenet
wagenet commented Nov 18, 2024

We've seen this on other developer machines at my company as well.

@chris-olszewski
Member

If either of you could share daemon logs (turbo daemon status should display the logfile) that would be helpful. We should not be spawning child processes from the daemon.

@chris-olszewski chris-olszewski removed the needs: triage New issues get this label. Remove it after triage label Nov 18, 2024
@NullVoxPopuli
Author

NullVoxPopuli commented Nov 19, 2024

Here is what I got:

❯ pnpm turbo daemon status
# ...
✓ daemon is running
log file: <repo>/.turbo/daemon/e224a4a441d772ef-turbo.log.2024-11-19
uptime: 16m 6s 566mss
pid file: /var/folders/wk/w99lck4x7_5930c7gj65r3s40000gp/T/turbod/e224a4a441d772ef/turbod.pid
socket file: /var/folders/wk/w99lck4x7_5930c7gj65r3s40000gp/T/turbod/e224a4a441d772ef/turbod.sock
ope, big file

there is a lot of text

There was a problem saving your comment. 
Your comment is too long (maximum is 65536 characters). 
Please try again.

oops 🙈

here is a file tho

output.txt

as I was poking around in here, I noticed there was a lot of activity from watchman cookies.
