Skip to content

Conversation

@weirdwiz
Copy link
Contributor

@weirdwiz weirdwiz commented Jul 18, 2025

The ocs-metrics-exporter creates zombie processes when executing ceph and rbd commands to collect metrics. While we use CombinedOutput() which should wait for process completion, zombies are still accumulating in some customer environments. Adding tini as a minimal init system (via the ENTRYPOINT) ensures all zombie processes are properly reaped regardless of how they're created.

https://issues.redhat.com/browse/DFBUGS-2875

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jul 18, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: weirdwiz
Once this PR has been reviewed and has the lgtm label, please assign obnoxxx for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@weirdwiz
Copy link
Contributor Author

ptal @umangachapagain @aruniiird

@weirdwiz
Copy link
Contributor Author

docker's --init flag also uses tini as an init process to handle zombies

The metrics-exporter spawns ceph and rbd commands to collect metrics.
Without a proper init system, these child processes can become zombies
in certain container environments. Adding tini as PID 1 ensures all
zombie processes are properly reaped.

Signed-off-by: Divyansh Kamboj <[email protected]>
@weirdwiz
Copy link
Contributor Author

weirdwiz commented Jul 18, 2025

/cherry-pick release-4.18

@openshift-cherrypick-robot

@weirdwiz: once the present PR merges, I will cherry-pick it on top of release-4.18 in a new PR and assign it to you.

In response to this:

/cherry-pick release-4.18 release-4.19 release-4.17 release-4.16

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@weirdwiz
Copy link
Contributor Author

/cherry-pick release-4.19

@openshift-cherrypick-robot

@weirdwiz: once the present PR merges, I will cherry-pick it on top of release-4.19 in a new PR and assign it to you.

In response to this:

/cherry-pick release-4.19

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@weirdwiz
Copy link
Contributor Author

/cherry-pick release-4.16
/cherry-pick release-4.17

@openshift-cherrypick-robot

@weirdwiz: once the present PR merges, I will cherry-pick it on top of release-4.16, release-4.17 in new PRs and assign them to you.

In response to this:

/cherry-pick release-4.16
/cherry-pick release-4.17

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link
Contributor

@idryomov idryomov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we use CombinedOutput() which should wait for process completion, zombies are still accumulating in some customer environments.

Do you have any details as to what those customer environments look like? I noticed that metrics-exporter sets a 30 second timeout for all execCommand() invocations -- is there evidence of those timeouts expiring?

func execCommand(command string, args []string, timeout int) ([]byte, error) {
	var cancel context.CancelFunc
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeout)*time.Second)
	defer cancel()
	cmd := exec.CommandContext(ctx, command, args...)

	output, err := cmd.CombinedOutput()
        ...

I'm asking because when WithTimeout is applied to CommandContext, SIGKILL is sent only to that particular process. If command happens to spawn child processes, they normally remain unaffected.

Comment on lines +19 to +20
RUN dnf install -y tini && \
dnf clean all
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ceph containers built by RHCS don't have dnf available. They use microdnf instead, and tini isn't an installable package. This might break when using RHCS images unless the build team has a different dockerfile for this

Copy link
Contributor

@BlaineEXE BlaineEXE Jul 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an alternative, you could consider installing tini-static (tini built as a static binary) in a new builder stage and then copying /usr/bin/tini-static to the final container from the build stage. That still would have some challenges coordinating with the build team tho

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before going through the motions with tini, I'd suggest taking a harder look at whether there is a need to have a process reaper in place at all. If the issue occurs only when the timeout expires, it would likely be easier to fix the respective code in execCommand().

@leelavg
Copy link
Contributor

leelavg commented Jul 23, 2025

@weirdwiz how did you reproduce this issue? If the subprocess is becoming zombie and you are using tini only to reap them when they get re-parented, it maybe possible to just do cmd.Wait() on it (we can also disable sigchld but it may effect our whole parent process)?

As Ilya mentioned, if the subprocess is re-forking you may end up w/ orphan's. Not sure how this can be done in Go, usual way would be during exec do a setsid and kill the group (ie, negative pid) and wait, when done/timedout which will result in no zombie and no orphan.

So, if you can provide the steps on how you reproduced the issue & the leaks observed, it'll be very helpful.

thanks!

@weirdwiz
Copy link
Contributor Author

@weirdwiz how did you reproduce this issue? If the subprocess is becoming zombie and you are using tini only to reap them when they get re-parented, it maybe possible to just do cmd.Wait() on it (we can also disable sigchld but it may effect our whole parent process)?

I was not able to reproduce, it completely in a dev env. And tested out the reap with manually creating zombies.

I also suspect like @idryomov that sub process is re-forking and we're exiting out, but there was not enough hard evidence in the logs (the process exited out with -1), so I thought that the safest way to make sure we don't run into it is to setup tini

As Ilya mentioned, if the subprocess is re-forking you may end up w/ orphan's. Not sure how this can be done in Go, usual way would be during exec do a setsid and kill the group (ie, negative pid) and wait, when done/timedout which will result in no zombie and no orphan.

i think we can set cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true} to also send SIGKILL to the reforked processes

@weirdwiz
Copy link
Contributor Author

tested out reproducing the issue with adding delays in the network, also forcing issues with credentials (forcing -1 exit code)
but haven't seen defunct processes created in the system,
tini will clean up zombies, that happen to show up

@leelavg @idryomov where do you think we should proceed with this issue?

@idryomov
Copy link
Contributor

@weirdwiz Have you reproduced the timeout expiry scenario specifically?

@weirdwiz
Copy link
Contributor Author

@weirdwiz Have you reproduced the timeout expiry scenario specifically?

yes, i've reproduced that in a cluster, but that didn't lead to zombies getting created

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants