| title | draft | date |
|---|---|---|
| Adopters | false | 2021-03-08T23:50:39+01:00 |
This document tracks people and use cases for the Prometheus Operator in production. By creating a list of production use cases, we hope to build a community of advisors we can reach out to who have experience with various Prometheus Operator applications, operating environments, and cluster sizes. The Prometheus Operator development team may reach out periodically to check in on how the Prometheus Operator is working in the field and to update this list.
Go ahead and add your organization to the list.
Environments: Bare Metal, Opennebula
Uses kube-prometheus: Yes
Details:
- multiple K8s clusters with Prometheus deployed through prom-operator
- several of our own Ceph clusters providing metrics via the ceph mgr prometheus module
- several customer Ceph clusters pushing metrics via an external pushgateway to our central monitoring instances
- Thanos receiver connected to our own S3 storage
Environments: AWS
Uses kube-prometheus: Yes
Details:
- Operator installed on each Kubernetes cluster, with Thanos aggregating metrics from a central query endpoint
- Two Prometheus instances per cluster
- Loose coupling between Kubernetes cluster administrators who manage alerting sinks and service owners who define alerts for their services
- 800K samples/s
- 30M active series
Environments: AWS, Azure, Bare Metal
Uses kube-prometheus: Yes (with additional tight Giant Swarm integrations)
Details:
- One prometheus operator per management cluster and one prometheus instance per workload cluster
- Customers can also install kube-prometheus for their workload using our App Platform
- 760000 samples/s
- 35M active series
Environments: Google Cloud
Uses kube-prometheus: Yes (with additional Gitpod mixins)
Details:
- One prometheus instance per cluster (8 so far)
- 20000 samples/s
- 1M active series
Environments: AWS, Azure
Uses kube-prometheus: Yes
Details (optional):
- multiple remote K8s clusters in which we have Prometheus deployed through prom-operator.
- these remote Prometheus instances push cluster metrics to a central Thanos receiver, which is connected to S3 storage.
- on top of Thanos we have Grafana for dashboarding and visualisation.
https://kinvolk.io/lokomotive-kubernetes/
Environments: AKS, AWS, Bare Metal, Equinix Metal
Uses kube-prometheus: Yes
Details:
- Self-hosted (control plane runs as pods inside the cluster)
- Deploys full K8s stack (as a distro) or managed Kubernetes (currently only AKS supported)
- Deployed by Kinvolk for its own hosted infrastructure (including Flatcar Container Linux update server), as well as by Kinvolk customers and community users
Environments: AWS
Uses kube-prometheus: Yes
Details:
- One prometheus operator in our platform cluster and one prometheus instance per workload cluster
- 17k samples/s
- 841k active series
Environments: AWS
Uses kube-prometheus: Yes
Details:
- All Mattermost clusters use the Prometheus Operator with Thanos sidecar for cluster monitoring and central Thanos query component to gather all data.
- 977k samples/s
- 29.4M active series
Environment: Google Cloud
Uses kube-prometheus: Yes
Details:
- 100k samples/s
- 1M active series
Environments: AWS, Azure, Google Cloud, Bare Metal
Uses kube-prometheus: Yes (with additional tight OpenShift integrations)
This is a meta user; please feel free to document specific OpenShift users!
All OpenShift clusters use the Prometheus Operator to manage the cluster monitoring stack as well as user workload monitoring. This means the Prometheus Operator's users include all OpenShift customers.
Environments: AWS, Google Cloud
Uses kube-prometheus: No
Opstrace installations use the Prometheus Operator internally to collect metrics and to alert. Opstrace users also often use the Prometheus Operator to scrape their own applications and remote_write those metrics to Opstrace.
Environment: Google Cloud
Uses kube-prometheus: Yes
Details:
- HA Pair of Prometheus
- 4000 samples/s
- 100k active series
Environments: EKS, GKE, AKS, and self-hosted Kubernetes
Uses kube-prometheus: Yes
We're an open source project that builds upon the awesome Prometheus Operator. We run automated playbooks in response to Prometheus alerts and other events in your cluster. For example, you can automatically fetch logs and send them to Slack when a Prometheus alert occurs. All it takes is this YAML:

    triggers:
      - on_prometheus_alert:
          alert_name: KubePodCrashLooping
    actions:
      - logs_enricher: {}
    sinks:
      - slack

Environment: AWS
Uses kube-prometheus: Yes
Details (optional):
- HA Pairs of Prometheus
- 25000 samples/s
- 1.2M active series
suse.com/products/suse-rancher
Environments: RKE, RKE2, K3s, Windows, AWS, Azure, Google Cloud, Bare Metal, etc.
Uses kube-prometheus: Yes
Rancher Monitoring supports use cases for Prometheus Operator across the various cluster types and setups that are managed via the Rancher product. All Rancher users that install Monitoring V2 deploy this chart.
For more information, please see how Rancher monitoring works.
The open-source rancher-monitoring Helm chart (based on kube-prometheus-stack) can be found at rancher/charts.
Environments: Bare Metal
Uses kube-prometheus: Yes
Details (optional):
- HA Pair of Prometheus
- 517000 samples/s
- 10.7M active series
Environments: AWS, Azure, Google Cloud, cloudscale.ch, Exoscale, Swisscom
Uses kube-prometheus: Yes
Details (optional):
- A huge fleet of OpenShift and Kubernetes clusters, each using Prometheus Operator
- All managed by Project Syn, leveraging Commodore Components like component-rancher-monitoring which re-uses Prometheus Operator
Environments: Kubernetes, AWS (via some EC2)
Uses kube-prometheus: No
Details (optional):
- About 30 HA pairs of sharded Promethei across 10 environments, wired together with Thanos
- The Operator also helps us seamlessly manage between 600 and 1500 short-lived Prometheus instances for our "integration" Kubernetes cluster.
- ~15M samples/s
- ~200M active series
Environments: AWS, Azure, Google Cloud, Bare Metal, etc
Uses kube-prometheus: Yes | No
Details (optional):
- HA Pair of Prometheus
- 1000 samples/s (query: `rate(prometheus_tsdb_head_samples_appended_total[5m])`)
- 10k active series (query: `prometheus_tsdb_head_series`)
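
If you would rather script the two queries from the template than paste them into the Prometheus UI, the following is a minimal sketch of one way to do it. It assumes your Prometheus instance exposes the standard HTTP API at `/api/v1/query` and is reachable without authentication. The helper names (`build_query_url`, `extract_values`, `query`) are made up for this example and are not part of the Prometheus Operator.

```python
import json
import urllib.parse
import urllib.request

# The two queries quoted in the template entry above.
SAMPLES_PER_SECOND = "rate(prometheus_tsdb_head_samples_appended_total[5m])"
ACTIVE_SERIES = "prometheus_tsdb_head_series"


def build_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    params = urllib.parse.urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def extract_values(payload):
    """Return the sample values of an instant-vector response as floats."""
    if payload.get("status") != "success":
        raise ValueError("query failed: %s" % payload.get("error", "unknown error"))
    # Each result entry looks like {"metric": {...}, "value": [timestamp, "number"]}.
    return [float(sample["value"][1]) for sample in payload["data"]["result"]]


def query(base_url, promql):
    """Run an instant query against base_url (hypothetical helper, no auth)."""
    with urllib.request.urlopen(build_query_url(base_url, promql), timeout=10) as resp:
        return extract_values(json.load(resp))
```

For example, `query("http://localhost:9090", ACTIVE_SERIES)` would return one value per Prometheus instance that reports the metric, where `http://localhost:9090` is only a placeholder for your own endpoint. With an HA pair you will typically see two values, so decide whether to report one replica or the sum.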