'Apply/ReplaceResource' in resource_ops.go may leak files to '/dev/shm' since the kubectl 'apply/replace' commands never time out #572

Open
jgwest opened this issue May 4, 2024 · 0 comments · May be fixed by #573
jgwest commented May 4, 2024

gitops-engine directly calls kubectl command code to create/apply/replace/delete K8s resources on the cluster. This ensures that the logic used by gitops-engine consumers (such as Argo CD) interacts with those K8s resources in a way that is compatible with kubectl.

However, at present, gitops-engine does not specify a timeout value for 'kubectl create/apply/replace' commands.

This means that in rare cases (such as cluster or network issues), the kubectl operation will remain running forever, waiting for an I/O operation that may never complete.

Normally this would just be a small memory leak (i.e. not necessarily the end of the world). However, in order to call the kubectl command code, gitops-engine writes manifest files to '/dev/shm', which are then passed to kubectl via the '-f' file option.

This means that those long-running I/O operations also leak K8s manifest files in '/dev/shm': each manifest file must remain in '/dev/shm' while its I/O operation is in progress. '/dev/shm' appears limited to 64MB, which can fill quickly.

  • When examining the contents of /dev/shm from users that have reported this issue, we see a large number of miscellaneous manifests that are hours or days old (dating back to the last Pod restart).

The proposed solution (PR attached) is to add a long default timeout to calls to kubectl's apply command.

Related: #568

@jgwest jgwest changed the title 'Apply/ReplaceResource' 'Apply/ReplaceResource' in resource_ops.go may leak files to '/dev/shm' if the kubectl 'apply/replace' command never times out May 4, 2024
@jgwest jgwest self-assigned this May 4, 2024
@jgwest jgwest changed the title 'Apply/ReplaceResource' in resource_ops.go may leak files to '/dev/shm' if the kubectl 'apply/replace' command never times out 'Apply/ReplaceResource' in resource_ops.go may leak files to '/dev/shm' since the kubectl 'apply/replace' commands never time out May 4, 2024
jgwest added a commit to jgwest/gitops-engine that referenced this issue May 4, 2024
@jgwest jgwest linked a pull request May 4, 2024 that will close this issue