Add "sleep" feature to the Docker image as a separate binary or as an argument #656
Labels
community: Issues or PRs opened by an external contributor
help wanted: Issues identified as good community contribution opportunities
refined: Issues that are ready to be prioritized
Is your feature request related to a problem? Please describe.
I'm trying to use the exporter roughly as described in this article: I launch the Exporter as a sidecar container to Nginx in a Kubernetes Pod.
The problem is that Nginx in my setup is itself a sidecar to the backend container, and it uses a preStop container lifecycle hook: a simple "exec" command that runs "sleep". This helps mitigate some 5xx errors for end users.
I tried to configure a similar preStop hook for the Exporter. Unfortunately, there's no binary in the Docker image that I can call to run the `sleep X` command, which is what I wanted to use in the hook.
Without such a hook, Kubernetes may kill the Exporter container before the Nginx container, so a portion of very important metrics is never exported to the monitoring system. Many corner cases appear when a Kubernetes Pod goes down and a new one starts as a replacement, and accurate monitoring is crucial for debugging them.
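For illustration, the kind of preStop hook described above could look roughly like the fragment below. This is a sketch and not the snippet from the original report: the container name and the 10-second duration are assumptions, and the image tag is deliberately left out.

```yaml
# Illustrative sketch only: container name and sleep duration are assumptions.
containers:
  - name: nginx-exporter                      # hypothetical container name
    image: nginx/nginx-prometheus-exporter    # tag omitted on purpose
    lifecycle:
      preStop:
        exec:
          # With the current image this hook fails, because the image
          # ships no "sleep" binary for the exec action to call.
          command: ["sleep", "10"]
```

If the image provided a sleep command, the kubelet would run it before sending the termination signal to the container, which keeps the Exporter alive while Nginx finishes its own preStop delay.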
The Kubernetes developers introduced a feature gate called `PodLifecycleSleepAction`, which is described here and whose goal is basically to replicate that `sleep` command. The problem is that the feature gate's current status is `alpha`, and it has only been available since Kubernetes 1.29. Cloud platforms such as AWS don't allow alpha gates in their Kubernetes implementations, so it can take a year or more for this feature to land in the EKS world.
Describe the solution you'd like
It would be very nice if the "sleep" feature were included in the Docker image of `nginx-prometheus-exporter`, either as a separate command or as part of the binary itself, i.e. an argument such as `nginx-prometheus-exporter --sleep 10` that would simply return exit code 0 after the sleep.
Describe alternatives you've considered
As mentioned above, `PodLifecycleSleepAction` is the best alternative. However, it will take too long for the feature to become available in some production environments with long update lifecycles. In my case, we aren't even close to Kubernetes 1.29.
Additional context
–