Describe the bug
I'm deploying an HA Vault setup in our Kubernetes cluster with three replicas. While working on monitoring for the seal status of the Vault pods, I noticed that the Prometheus metrics go away when all Vault pods are sealed, which makes it impossible to trigger an alert for this state.
This appears to happen because the vault ServiceMonitor selects the vault-active Service, which in turn selects the Vault pod carrying the vault-active: "true" label. When all Vault pods are sealed, they all carry the vault-active: "false" label, so the Service matches no pod and returns 503 when the ServiceMonitor attempts to fetch metrics.
To Reproduce
Configure Prometheus metrics, then seal all the Vault pods by restarting them.
Expected behavior
We should be able to get metrics and monitor the seal state via the vault_core_unsealed metric even when all Vault pods are sealed.
We achieved this by removing vault-active: "true" from the ServiceMonitor matchLabels field and adding a new unique label both there and to the vault Service object. This ensures the ServiceMonitor uses only the vault Service object, which routes to the Vault pods regardless of their active status.
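A minimal sketch of that workaround, for illustration only. The label name vault-metrics is a hypothetical placeholder for the "new unique label", the port name http and the metrics path are assumptions about a typical setup, and the Service is shown as a fragment with only the added label rather than the full object:

```yaml
# Fragment of the existing "vault" Service: only the added label is shown.
apiVersion: v1
kind: Service
metadata:
  name: vault
  labels:
    vault-metrics: "true"     # hypothetical unique label; any unused name works
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: vault
spec:
  selector:
    matchLabels:
      # vault-active: "true" has been removed from this selector
      vault-metrics: "true"   # must match the label added to the Service above
  endpoints:
    - port: http              # assumed port name; adjust to the Service's port
      path: /v1/sys/metrics   # assumed metrics path
      params:
        format:
          - prometheus
```

With the ServiceMonitor pointed at the vault Service, an alert expression along the lines of vault_core_unsealed == 0 should have data to evaluate while the pods are sealed, assuming the gauge reports 0 in that state.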
Environment
Kubernetes version: v1.26.9-eks-a5df82a
Distribution or cloud vendor (OpenShift, EKS, GKE, AKS, etc.): EKS
0.25.0
Chart values: