
feat(example,metrics): kube-state-metrics to monitor custom resource … #10277

Status: Open (wants to merge 1 commit into base: main)

Conversation

@sebastiangaiser (Contributor) commented Jun 28, 2024

…state

To monitor the state of custom resources (CRs) inside a Kubernetes cluster, kube-state-metrics can be deployed. This example describes the deployment using the prometheus-community Helm chart.

Issue: #10276

Type of change

Select the type of your PR

  • Enhancement / new feature

Description

To monitor the state of custom resources (CRs) inside a Kubernetes cluster, kube-state-metrics can be deployed. This example describes the deployment using the prometheus-community Helm chart.
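As an illustration of what this deployment could look like, the prometheus-community kube-state-metrics Helm chart exposes a `customResourceState` section in its values. The sketch below is an assumption, not a verified configuration: the exact value keys depend on the chart version, and the group/version and RBAC rules must match the Strimzi CRDs actually installed.

```yaml
# Sketch of values for the prometheus-community/kube-state-metrics Helm chart.
# Treat the keys and paths here as assumptions to verify against your chart version.
customResourceState:
  enabled: true
  config:
    kind: CustomResourceStateMetrics
    spec:
      resources:
        - groupVersionKind:
            group: kafka.strimzi.io
            version: v1beta2
            kind: Kafka
          metricNamePrefix: strimzi_kafka
          metrics:
            - name: resource_info
              help: "Metadata of a Kafka custom resource"
              each:
                type: Info
                info:
                  labelsFromPath:
                    name: [metadata, name]
# kube-state-metrics also needs RBAC access to the custom resources it watches:
rbac:
  extraRules:
    - apiGroups: ["kafka.strimzi.io"]
      resources: ["kafkas"]
      verbs: ["list", "watch"]
```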

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

…state


Issue: strimzi#10276
Signed-off-by: Sebastian Gaiser <[email protected]>
@scholzj linked an issue Jun 28, 2024 that may be closed by this pull request
@scholzj (Member) left a comment

Thanks for the PR. I think this is a good idea. I think we should consider some additional things here:

  • Do we want to document this? (I guess we have some notes on what is in the examples, so we should mention it?)
  • Do we want to have some no-Helm variant of this in the examples? We generally have a pure YAML-based installation as the primary source, and many users do not use Helm, so this is useful only for some users.
  • Should we provide this for all custom resources and deprecate (and later remove) the state metrics provided by the Strimzi CO? (This might not be a task for this PR, but it should be considered when adopting this.)
  • Do we want to have some System Test coverage?

CC @strimzi/maintainers

@sebastiangaiser (Contributor, Author)

> Should we provide this for all custom resources

I think this would be a good idea, e.g. when reconciliation for a Kafka resource fails, or a KafkaRebalance has problems, ... But I haven't had any problems yet, and I'm not sure which fields are relevant for all resources.

> Do we want to have some no-Helm variant of this in the examples?

I think this depends on how you would like to deploy the monitoring. The Flux example uses the kube-prometheus-stack and injects the values there (into the Helm release). Personally, I'm not a big fan of mixing my monitoring stack with Flux/Strimzi/...-specific monitoring, so I ended up deploying a second kube-state-metrics (ksm) for Flux and Strimzi (yes, I use Flux).

@scholzj (Member) commented Jun 29, 2024

> I think this depends on how you would like to deploy the monitoring. The Flux example uses the kube-prometheus-stack and injects the values there (into the Helm release). Personally, I'm not a big fan of mixing my monitoring stack with Flux/Strimzi/...-specific monitoring, so I ended up deploying a second kube-state-metrics for Flux and Strimzi (yes, I use Flux).

Well, the Prometheus part seems to be just a custom resource that can have its own YAML as well. I also assume that the configuration will end up in some ConfigMap or some environment variables to configure kube-state-metrics? So that can be described or stored in separate YAMLs.
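As a sketch of that "separate YAMLs" idea, the Prometheus side could indeed be a plain custom resource, e.g. a `ServiceMonitor` scraping the kube-state-metrics Service. The names, namespace, labels, and port name below are hypothetical and would need to match the actual deployment:

```yaml
# Hypothetical ServiceMonitor scraping a dedicated kube-state-metrics instance.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: strimzi-kube-state-metrics
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  endpoints:
    - port: http        # port name on the kube-state-metrics Service (assumed)
      interval: 30s
```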

> I think this would be a good idea, e.g. when reconciliation for a Kafka resource fails, or a KafkaRebalance has problems, ... But I haven't had any problems yet, and I'm not sure which fields are relevant for all resources.

I guess a start might be to replicate what the Cluster Operator does: a metric to indicate whether the resource is ready or not. That would allow us to drop it from the Cluster Operator. We can improve things later as ideas pop up. But we can wait on this for some discussion and have others chime in with their thoughts.
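A readiness metric like the one described above could be sketched in the kube-state-metrics CustomResourceStateMetrics configuration as follows. This is an assumption, not a verified config: the metric name, group/version, and the conversion of `"True"`/`"False"` condition statuses to 1/0 should be checked against the kube-state-metrics custom-resource-state documentation.

```yaml
# Hypothetical metric: one gauge per status condition of each Kafka resource.
# Readiness would then be queried in PromQL by selecting type="Ready".
kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: kafka.strimzi.io
        version: v1beta2
        kind: Kafka
      metricNamePrefix: strimzi_kafka
      metrics:
        - name: status_conditions
          help: "Status conditions of a Kafka custom resource"
          each:
            type: Gauge
            gauge:
              path: [status, conditions]
              labelsFromPath:
                type: [type]          # e.g. Ready, Warning
              valueFrom: [status]     # "True"/"False" mapped to 1/0 (assumed)
```

With such a config, a query like `strimzi_kafka_status_conditions{type="Ready"} == 0` could flag resources that are not ready.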

Development

Successfully merging this pull request may close these issues.

[Enhancement]: Monitoring of custom resources