Service latency measurement (kube-burner#516)
## Type of change

- [ ] Refactor
- [x] New feature
- [ ] Bug fix
- [ ] Optimization
- [x] Documentation Update

## Description

Service ready latency measurement. Please take a look at the updated
docs for more information about this feature.


![image](https://github.com/cloud-bulldozer/kube-burner/assets/4614641/566df440-8961-4de1-a41e-faee4c40074e)


## Related Tickets & Documents

- Closes kube-burner#467 

## Checklist before requesting a review

- [x] I have performed a self-review of my code.
- [x] If it is a core feature, I have added thorough tests.

---------

Signed-off-by: Raul Sevilla <[email protected]>
rsevilla87 authored Jan 30, 2024
1 parent 0a8b0d5 commit 99e0b8e
Showing 28 changed files with 913 additions and 220 deletions.
4 changes: 2 additions & 2 deletions cmd/kube-burner/kube-burner.go
@@ -169,8 +169,8 @@ func destroyCmd() *cobra.Command {
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
labelSelector := fmt.Sprintf("kube-burner-uuid=%s", uuid)
burner.CleanupNamespaces(ctx, labelSelector)
burner.CleanupNonNamespacedResources(ctx, labelSelector)
util.CleanupNamespaces(ctx, clientSet, labelSelector)
util.CleanupNonNamespacedResources(ctx, clientSet, burner.DynamicClient, labelSelector)
},
}
cmd.Flags().StringVar(&uuid, "uuid", "", "UUID")
118 changes: 107 additions & 11 deletions docs/measurements.md
@@ -2,11 +2,7 @@

Kube-burner allows you to get further metrics using other mechanisms or data sources, such as the Kubernetes API. These mechanisms are called measurements.

Measurements are enabled in the measurements section of the configuration file. This section contains a list of measurements with their options.
'kube-burner' supports the following measurements so far:

!!! Warning
`podLatency`, as any other measurement, is only captured during a benchmark runtime. It does not work with the `index` subcommand of kube-burner
Measurements are enabled in the `measurements` object of the configuration file. This object contains a list of measurements with their options.

## Pod latency

@@ -17,7 +13,9 @@ Collects latencies from the different pod startup phases, these **latency metric
- name: podLatency
```
This measurement sends its metrics to a configured indexer. The metrics collected are pod latency histograms (`podLatencyMeasurement`) and four documents holding a summary with different pod latency quantiles of each pod condition (`podLatencyQuantilesMeasurement`). It is possible to skip indexing the `podLatencyMeasurement` metric by configuring the field `podLatencyMetrics` of this measurement to `quantiles`.
### Metrics
The metrics collected are pod latency timeseries (`podLatencyMeasurement`) and four documents holding a summary with different pod latency quantiles of each pod condition (`podLatencyQuantilesMeasurement`). It's possible to skip indexing the `podLatencyMeasurement` metric by configuring the field `podLatencyMetrics` of this measurement to `quantiles`.
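
If only the quantile summaries are needed, a minimal sketch using the `podLatencyMetrics` field mentioned above:

```yaml
measurements:
  - name: podLatency
    podLatencyMetrics: quantiles   # skip indexing the podLatencyMeasurement timeseries
```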

One document, such as the following, is indexed for each pod created by the workload that reaches the `Running` condition while the job runs:

@@ -29,12 +27,11 @@ One document, such as the following, is indexed per each pod created by the work
"containersReadyLatency": 2997,
"podReadyLatency": 2997,
"metricName": "podLatencyMeasurement",
"jobName": "kubelet-density",
"uuid": "c40b4346-7af7-4c63-9ab4-aae7ccdd0616",
"namespace": "kubelet-density",
"podName": "kubelet-density-13",
"jobConfig": {},
"nodeName": "worker-001"
"nodeName": "worker-001",
"jobConfig": {"config": "params"}
}
```

@@ -53,7 +50,9 @@ Pod latency quantile sample:
"avg": 2876.3,
"timestamp": "2020-11-15T22:26:51.553221077+01:00",
"metricName": "podLatencyQuantilesMeasurement",
"jobConfig": {}
"jobConfig": {
"config": "params"
}
},
{
"quantileName": "PodScheduled",
@@ -65,7 +64,9 @@ Pod latency quantile sample:
"avg": 5.38,
"timestamp": "2020-11-15T22:26:51.553225151+01:00",
"metricName": "podLatencyQuantilesMeasurement",
"jobConfig": {}
"jobConfig": {
"config": "params"
}
}
```

@@ -134,6 +135,101 @@ time="2023-11-19 17:46:08" level=info msg="Pod latencies error rate was: 0.00" f
time="2023-11-19 17:46:08" level=info msg="👋 Exiting kube-burner vchalla" file="kube-burner.go:209"
```
## Service latency
Calculates the time taken by services to serve requests once their endpoints are ready. This measurement works as follows:
```mermaid
graph LR
A[Service created] --> C{active endpoints?}
C -->|No| C
C -->|Yes| D[Save timestamp]
D --> G{TCP connectivity?}
G-->|Yes| F(Generate metric)
G -->|No| G
```

Where the service latency is the time elapsed from the moment the service has at least one ready endpoint until connectivity is verified.

The connectivity check is done through a pod running in the `kube-burner-service-latency` namespace. kube-burner connects to this pod and uses `netcat` to verify connectivity.

This measurement is enabled with:

```yaml
measurements:
- name: serviceLatency
svcTimeout: 5s
```
Where `svcTimeout`, by default `5s`, defines the maximum amount of time the measurement will wait for a service to be ready; when this timeout is reached, the metric from that service is **discarded**.

!!! warning "Considerations"
- Only TCP is supported.
- Supported services are `ClusterIP`, `NodePort` and `LoadBalancer`.
- kube-burner starts checking service connectivity when its endpoints object has at least one address.
- Make sure the endpoints of the service are correct and reachable from the pod running in the `kube-burner-service-latency` namespace.
- When the service is `NodePort`, the connectivity check is done against the node where the connectivity-check pod runs.
- By default, all services created by the benchmark are tracked by this measurement. Service objects can be excluded from tracking by annotating them with `kube-burner.io/service-latency=false`, as shown in the example after this list.
- Keep in mind that when the service is of `LoadBalancer` type, the provider needs to set up the load balancer, which adds some extra delay.
- Endpoints are pinged one after another; this can add some delay when the service has a large number of endpoints.
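
For example, a hypothetical Service excluded from tracking with the annotation mentioned in the list above (all names are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: excluded-svc                          # hypothetical Service name
  annotations:
    kube-burner.io/service-latency: "false"   # this Service is not measured
spec:
  selector:
    app: excluded-app                         # hypothetical selector
  ports:
    - port: 80
      protocol: TCP
```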

### Metrics

The metrics collected are service latency timeseries (`svcLatencyMeasurement`) and another document that holds a summary with the different service latency quantiles (`svcLatencyQuantilesMeasurement`). It is possible to skip indexing the `svcLatencyMeasurement` metric by configuring the field `svcLatencyMetrics` of this measurement to `quantiles`. Metric documents have the following structure:

```json
{
"timestamp": "2023-11-19T00:41:51Z",
"ready": 1631880721,
"metricName": "svcLatencyMeasurement",
"jobConfig": {
"config": "params"
},
"uuid": "c4558ba8-1e29-4660-9b31-02b9f01c29bf",
"namespace": "cluster-density-v2-2",
"service": "cluster-density-1",
"type": "ClusterIP"
}
```

!!! note
When type is `LoadBalancer`, it includes an extra field, `ipAssigned`, that reports the IP assignment latency of the service.

And the quantiles document has the following structure:

```json
{
"quantileName": "Ready",
"uuid": "c4558ba8-1e29-4660-9b31-02b9f01c29bf",
"P99": 1867593282,
"P95": 1856488440,
"P50": 1723817691,
"max": 1868307027,
"avg": 1722308938,
"timestamp": "2023-11-19T00:42:26.663991359Z",
"metricName": "svcLatencyQuantilesMeasurement",
"jobConfig": {
"config": "params"
}
},
{
"quantileName": "LoadBalancer",
"uuid": "c4558ba8-1e29-4660-9b31-02b9f01c29bf",
"P99": 1467593282,
"P95": 1356488440,
"P50": 1323817691,
"max": 2168307027,
"avg": 1822308938,
"timestamp": "2023-11-19T00:42:26.663991359Z",
"metricName": "svcLatencyQuantilesMeasurement",
"jobConfig": {
"config": "params"
}
}
```

When there are `LoadBalancer` services, an extra document with `quantileName` set to `LoadBalancer` is also generated, as shown above.
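
As with pod latency, a minimal sketch that indexes only the quantile documents, using the `svcLatencyMetrics` field described above:

```yaml
measurements:
  - name: serviceLatency
    svcTimeout: 5s
    svcLatencyMetrics: quantiles   # skip indexing the svcLatencyMeasurement timeseries
```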

## pprof collection

This measurement can be used to collect Golang profiling information from processes running in pods from the cluster. To do so, kube-burner connects to pods labeled with `labelSelector` and running in `namespace`. This measurement uses an implementation similar to `kubectl exec`, and as soon as it connects to a pod it executes the command `curl <pprofURL>` to get the pprof data. pprof files are collected on a regular basis configured by the parameter `pprofInterval`, and the collected pprof files are downloaded from the pods to the local directory configured by the parameter `pprofDirectory`, which by default is `pprof`.
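
As a rough sketch of how the parameters described above might fit together (the nesting under `pprofTargets` is an assumption here; only the individual parameter names come from the paragraph above):

```yaml
measurements:
  - name: pprof
    pprofInterval: 30m        # how often pprof data is collected
    pprofDirectory: pprof     # local download directory, defaults to pprof
    pprofTargets:             # assumed grouping of per-target settings
      - name: kube-apiserver-heap                      # hypothetical target name
        namespace: openshift-kube-apiserver            # where the target pods run
        labelSelector: {app: openshift-kube-apiserver} # pods kube-burner connects to
        url: https://localhost:6443/debug/pprof/heap   # the <pprofURL> passed to curl
```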
2 changes: 1 addition & 1 deletion docs/observability/metrics.md
@@ -66,7 +66,7 @@ The collected metrics have the following shape:
]
```

Notice that kube-burner enriches the query results by adding some extra fields like `uuid`, `query`, `metricName` and `jobName`.
Notice that kube-burner enriches the query results by adding some extra fields like `uuid`, `query`, `metricName` and `jobConfig`.
!!! info
These extra fields are especially useful at the time of identifying and representing the collected metrics.

2 changes: 1 addition & 1 deletion go.mod
@@ -17,7 +17,7 @@ require (
k8s.io/apimachinery v0.27.2
k8s.io/client-go v0.27.2
k8s.io/kubectl v0.27.2
k8s.io/utils v0.0.0-20230505201702-9f6742963106
k8s.io/utils v0.0.0-20240102154912-e7106e64919e
kubevirt.io/api v0.58.0
)

4 changes: 2 additions & 2 deletions go.sum
@@ -1042,8 +1042,8 @@ k8s.io/kubectl v0.27.2 h1:sSBM2j94MHBFRWfHIWtEXWCicViQzZsb177rNsKBhZg=
k8s.io/kubectl v0.27.2/go.mod h1:GCOODtxPcrjh+EC611MqREkU8RjYBh10ldQCQ6zpFKw=
k8s.io/utils v0.0.0-20210802155522-efc7438f0176/go.mod h1:jPW/WVKK9YHAvNhRxK0md/EJ228hCsBRufyofKtW8HA=
k8s.io/utils v0.0.0-20211116205334-6203023598ed/go.mod h1:jPW/WVKK9YHAvNhRxK0md/EJ228hCsBRufyofKtW8HA=
k8s.io/utils v0.0.0-20230505201702-9f6742963106 h1:EObNQ3TW2D+WptiYXlApGNLVy0zm/JIBVY9i+M4wpAU=
k8s.io/utils v0.0.0-20230505201702-9f6742963106/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0=
k8s.io/utils v0.0.0-20240102154912-e7106e64919e h1:eQ/4ljkx21sObifjzXwlPKpdGLrCfRziVtos3ofG/sQ=
k8s.io/utils v0.0.0-20240102154912-e7106e64919e/go.mod h1:OLgZIPagt7ERELqWJFomSt595RzquPNLL48iOWgYOg0=
kubevirt.io/api v0.58.0 h1:qeNeRtD6AIJ5WVJuRXajmmXtnrO5dYchy+hpCm6QwhE=
kubevirt.io/api v0.58.0/go.mod h1:U0CQlZR0JoJCaC+Va0wz4dMOtYDdVywJ98OT1KmOkzI=
kubevirt.io/containerized-data-importer-api v1.50.0 h1:O01F8L5K8qRLnkYICIfmAu0dU0P48jdO42uFPElht38=
4 changes: 4 additions & 0 deletions hack/build_service_checker.sh
@@ -0,0 +1,4 @@
#!/bin/bash
# Build and push the multi-arch netcat image (fedora-nc) used for the
# service latency connectivity checks: fedora-minimal plus nmap-ncat and procps-ng.
echo -e "FROM registry.fedoraproject.org/fedora-minimal:latest\nRUN microdnf install -y nmap-ncat procps-ng" | podman build --jobs=4 --platform=linux/amd64,linux/arm64,linux/ppc64le,linux/s390x --manifest=quay.io/cloud-bulldozer/fedora-nc:latest -f - .
podman manifest push quay.io/cloud-bulldozer/fedora-nc:latest
8 changes: 4 additions & 4 deletions pkg/burner/create.go
@@ -118,7 +118,7 @@ func (ex *Executor) RunCreateJob(iterationStart, iterationEnd int, waitListNames
}
if ex.nsRequired && !ex.NamespacedIterations {
ns = ex.Namespace
if err = createNamespace(ns, nsLabels, nsAnnotations); err != nil {
if err = util.CreateNamespace(ClientSet, ns, nsLabels, nsAnnotations); err != nil {
log.Fatal(err.Error())
}
*waitListNamespaces = append(*waitListNamespaces, ns)
@@ -137,7 +137,7 @@
if ex.nsRequired && ex.NamespacedIterations {
ns = ex.generateNamespace(i)
if !namespacesCreated[ns] {
if err = createNamespace(ns, nsLabels, nsAnnotations); err != nil {
if err = util.CreateNamespace(ClientSet, ns, nsLabels, nsAnnotations); err != nil {
log.Error(err.Error())
continue
}
@@ -272,7 +272,7 @@ func (ex *Executor) replicaHandler(labels map[string]string, obj object, ns stri
func createRequest(gvr schema.GroupVersionResource, ns string, obj *unstructured.Unstructured, timeout time.Duration) {
var uns *unstructured.Unstructured
var err error
RetryWithExponentialBackOff(func() (bool, error) {
util.RetryWithExponentialBackOff(func() (bool, error) {
// When the object has a namespace already specified, use it
if objNs := obj.GetNamespace(); objNs != "" {
ns = objNs
@@ -368,7 +368,7 @@ func (ex *Executor) RunCreateJobWithChurn() {
if ex.ChurnDeletionStrategy == "gvr" {
CleanupNamespacesUsingGVR(ctx, *ex, namespacesToDelete)
}
CleanupNamespaces(ctx, "churndelete=delete")
util.CleanupNamespaces(ctx, ClientSet, "churndelete=delete")
log.Info("Re-creating deleted objects")
// Re-create objects that were deleted
ex.RunCreateJob(randStart, numToChurn+randStart, &[]string{})
3 changes: 2 additions & 1 deletion pkg/burner/delete.go
@@ -20,6 +20,7 @@ import (
"time"

"github.com/kube-burner/kube-burner/pkg/config"
"github.com/kube-burner/kube-burner/pkg/util"
log "github.com/sirupsen/logrus"
"k8s.io/apimachinery/pkg/api/meta"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -65,7 +66,7 @@ func (ex *Executor) RunDeleteJob() {
listOptions := metav1.ListOptions{
LabelSelector: labelSelector,
}
err := RetryWithExponentialBackOff(func() (done bool, err error) {
err := util.RetryWithExponentialBackOff(func() (done bool, err error) {
itemList, err = DynamicClient.Resource(obj.gvr).List(context.TODO(), listOptions)
if err != nil {
log.Errorf("Error found listing %s labeled with %s: %s", obj.gvr.Resource, labelSelector, err)
3 changes: 2 additions & 1 deletion pkg/burner/job.go
@@ -31,6 +31,7 @@ import (
"github.com/kube-burner/kube-burner/pkg/config"
"github.com/kube-burner/kube-burner/pkg/measurements"
"github.com/kube-burner/kube-burner/pkg/prometheus"
"github.com/kube-burner/kube-burner/pkg/util"
"github.com/kube-burner/kube-burner/pkg/util/metrics"
log "github.com/sirupsen/logrus"
"golang.org/x/time/rate"
@@ -334,7 +335,7 @@ func garbageCollectJob(ctx context.Context, jobExecutor Executor, labelSelector
if wg != nil {
defer wg.Done()
}
CleanupNamespaces(ctx, labelSelector)
util.CleanupNamespaces(ctx, ClientSet, labelSelector)
for _, obj := range jobExecutor.objects {
jobExecutor.limiter.Wait(ctx)
if !obj.Namespaced {