Merge branch 'main' into add_error_tracking
marctc authored Jul 15, 2024
2 parents 191d61b + b3b01db commit af6aad3
Showing 21 changed files with 263 additions and 51 deletions.
6 changes: 1 addition & 5 deletions charts/beyla/templates/cluster-role.yaml
Original file line number Diff line number Diff line change
@@ -15,12 +15,8 @@ rules:
resources: [ "replicasets" ]
verbs: [ "list", "watch" ]
- apiGroups: [ "" ]
{{- if or (eq .Values.preset "network") .Values.config.data.network }}
resources: [ "pods", "services", "nodes" ]
{{- else }}
resources: [ "pods" ]
{{- end }}
verbs: [ "list", "watch" ]
verbs: [ "list", "watch", "get" ]
{{- with .Values.rbac.extraClusterRoleRules }}
{{- toYaml . | nindent 2 }}
{{- end}}
5 changes: 4 additions & 1 deletion cmd/beyla/main.go
@@ -65,7 +65,10 @@ func main() {
// child process isn't found.
ctx, _ := signal.NotifyContext(context.Background(), os.Interrupt, syscall.SIGINT, syscall.SIGTERM)

components.RunBeyla(ctx, config)
if err := components.RunBeyla(ctx, config); err != nil {
slog.Error("Beyla can't start", "error", err)
os.Exit(-1)
}

if gc := os.Getenv("GOCOVERDIR"); gc != "" {
slog.Info("Waiting 1s to collect coverage data...")
12 changes: 6 additions & 6 deletions docs/sources/configure/options.md
@@ -835,7 +835,7 @@ The purpose of this value is to avoid reporting indefinitely finished applicatio

| YAML | Environment variable | Type | Default |
|------------|-------------------------------|-----------------|------------------------------|
| `features` | `BEYLA_OTEL_METRICS_FEATURES` | list of strings | `["application", "network"]` |
| `features` | `BEYLA_OTEL_METRICS_FEATURES` | list of strings | `["application"]` |

A list of metric groups which are allowed to be exported. Each group belongs to a different feature
of Beyla: application-level metrics or network metrics.
@@ -853,8 +853,8 @@ of Beyla: application-level metrics or network metrics.
the OpenTelemetry service names used in Beyla. In Kubernetes environments, the OpenTelemetry service name set by the service name
discovery is the best choice for service graph metrics.
- If the list contains `network`, the Beyla OpenTelemetry exporter exports network-level
metrics; but only if there is defined an OpenTelemetry endpoint and the
[network metrics are enabled]({{< relref "../network" >}}).
metrics; but only if there is an OpenTelemetry endpoint defined. For network-level metrics options visit the
[network metrics]({{< relref "../network" >}}) configuration documentation.

Usually you do not need to change this configuration option, unless, for example, a Beyla instance
instruments both network and applications, and you want to disable application-level metrics because
@@ -1214,7 +1214,7 @@ The `buckets` object allows overriding the bucket boundaries of diverse histogra

| YAML | Environment variable | Type | Default |
|------------|-----------------------------|-----------------|------------------------------|
| `features` | `BEYLA_PROMETHEUS_FEATURES` | list of strings | `["application", "network"]` |
| `features` | `BEYLA_PROMETHEUS_FEATURES` | list of strings | `["application"]` |

A list of metric groups that are allowed to be exported. Each group belongs to a different feature
of Beyla: application-level metrics or network metrics.
@@ -1232,8 +1232,8 @@ of Beyla: application-level metrics or network metrics.
the OpenTelemetry service names used in Beyla. In Kubernetes environments, the OpenTelemetry service name set by the service name
discovery is the best choice for service graph metrics.
- If the list contains `network`, the Beyla Prometheus exporter exports network-level
metrics; but only if the Prometheus `port` property is defined and the
[network metrics are enabled]({{< relref "../network" >}}).
metrics; but only if the Prometheus `port` property is defined. For network-level metrics options visit the
[network metrics]({{< relref "../network" >}}) configuration documentation.

Usually you do not need to change this configuration option, unless, for example, a Beyla instance
instruments both network and applications, and you want to disable application-level metrics because
64 changes: 64 additions & 0 deletions docs/sources/network/asserts.md
@@ -0,0 +1,64 @@
---
title: Set up Beyla network metrics in Kubernetes with Helm for Asserts
menuTitle: Set up Asserts network
description: A guide to install Beyla network metrics in Kubernetes with Helm for Asserts.
weight: 1
keywords:
- Beyla
- eBPF
- Network
---

# Set up Beyla network metrics in Kubernetes with Helm for Asserts

[Asserts](/docs/grafana-cloud/monitor-applications/asserts/) works with Beyla and requires Beyla network metrics. Learn how to set up Beyla network metrics in Kubernetes with Helm to export telemetry data to Asserts.

To learn more about Beyla network metrics, consult the [Network](/docs/beyla/latest/network/) documentation.

## Prerequisites

Before you install Beyla network metrics and export telemetry data to Asserts, you need:

1. A free Grafana Cloud account.
1. Access rights to a Kubernetes cluster, enough to create components with privileges.

You can register for a [free forever Grafana Cloud account](/auth/sign-up/create-user) in minutes and start sending telemetry data and monitoring your infrastructure and applications.

There are two ways to collect metrics and send them to Grafana Cloud for Asserts: through Kubernetes monitoring, or with an OpenTelemetry Collector.

## Configuration for Kubernetes monitoring

If you use Kubernetes monitoring and a Helm chart for scraping metrics, create a `values.yaml` with the following configuration:

```yaml
preset: network

podAnnotations:
  k8s.grafana.com/scrape: "true"
  k8s.grafana.com/job: beyla-network
  k8s.grafana.com/metrics.portName: metrics
```

## Configure for OpenTelemetry Collector

If you use an OpenTelemetry Collector for metrics collection, either Grafana Alloy or the upstream collector, create a `values.yaml` with the following configuration:

```yaml
preset: network
env:
  OTEL_EXPORTER_OTLP_ENDPOINT: your-otlp-endpoint:4318
```

## Install and run Beyla network metrics for Asserts

Run the following `helm` commands to add the `grafana` repository and install and run `beyla` with your configuration for network metrics:

```sh
helm repo add grafana https://grafana.github.io/helm-charts
helm install beyla --create-namespace -n beyla -f values.yaml grafana/beyla
```

## Observe your services in Asserts

Finally, navigate to Asserts in [Grafana Cloud](/auth/sign-in/) and view your instrumented services.
14 changes: 8 additions & 6 deletions docs/sources/network/config.md
@@ -2,7 +2,7 @@
title: Beyla Network Metrics configuration options
menuTitle: Configuration
description: Learn about the configuration options available for Beyla network metrics
weight: 2
weight: 3
keywords:
- Beyla
- eBPF
@@ -52,19 +52,21 @@ network metrics (in the previous example, `otel_metrics_export`, but it also acc
| -------- | ----------------------- | ------- | ------- |
| `enable` | `BEYLA_NETWORK_METRICS` | boolean | `false` |

Enables network metrics reporting in Beyla.
Explicitly enables network metrics reporting in Beyla. You can also enable network metrics reporting
by adding `network` to the list of `features` for [otel_metrics_export]({{< relref "../configure/options.md#otel-metrics-exporter" >}})
or [prometheus_export]({{< relref "../configure/options.md#prometheus-http-endpoint" >}}).

| YAML | Environment variable | Type | Default |
| -------------------- | ---------------------------------- | -------- | -------- |
| `source` | `BEYLA_NETWORK_SOURCE` | string | `tc` |
| YAML | Environment variable | Type | Default |
| -------------------- | ---------------------------------- | -------- | ------------------- |
| `source` | `BEYLA_NETWORK_SOURCE` | string | `socket_filter` |

Specifies the Linux Kernel feature used to source the network events Beyla reports.

The available options are: `tc` and `socket_filter`.

When `tc` is used as an event source, Beyla uses the Linux Traffic Control ingress and egress
filters to capture the network events, in a direct action mode. This event source mode assumes
that no other eBPF programs are attaching to the same Linux Traffic Control interface, in
that no other eBPF programs are attaching to the same Linux Traffic Control interface, in
direct action mode. For example, the Cilium Kubernetes CNI uses the same approach, therefore
if you have Cilium CNI installed in your Kubernetes cluster, configure Beyla to capture the
network events with the `socket_filter` mode.
14 changes: 11 additions & 3 deletions pkg/beyla/config.go
@@ -66,7 +66,7 @@ var DefaultConfig = Config{
Buckets: otel.DefaultBuckets,
ReportersCacheLen: ReporterLRUSize,
HistogramAggregation: otel.AggregationExplicit,
Features: []string{otel.FeatureNetwork, otel.FeatureApplication},
Features: []string{otel.FeatureApplication},
Instrumentations: []string{
instrumentations.InstrumentationALL,
},
@@ -86,7 +86,7 @@ var DefaultConfig = Config{
Prometheus: prom.PrometheusConfig{
Path: "/metrics",
Buckets: otel.DefaultBuckets,
Features: []string{otel.FeatureNetwork, otel.FeatureApplication},
Features: []string{otel.FeatureApplication},
Instrumentations: []string{
instrumentations.InstrumentationALL,
},
@@ -234,11 +234,19 @@ func (c *Config) Validate() error {
return nil
}

func (c *Config) promNetO11yEnabled() bool {
return c.Prometheus.Enabled() && c.Prometheus.NetworkMetricsEnabled()
}

func (c *Config) otelNetO11yEnabled() bool {
return (c.Metrics.Enabled() || c.Grafana.OTLP.MetricsEnabled()) && c.Metrics.NetworkMetricsEnabled()
}

// Enabled checks if a given Beyla feature is enabled according to the global configuration
func (c *Config) Enabled(feature Feature) bool {
switch feature {
case FeatureNetO11y:
return c.NetworkFlows.Enable
return c.NetworkFlows.Enable || c.promNetO11yEnabled() || c.otelNetO11yEnabled()
case FeatureAppO11y:
return c.Port.Len() > 0 || c.Exec.IsSet() || len(c.Discovery.Services) > 0 || c.Discovery.SystemWide
}
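
The change to `Enabled` above means network observability can now be switched on implicitly by an exporter's `features` list. A minimal, self-contained sketch of that rule follows. The `config` struct and its fields are hypothetical stand-ins, and the sketch assumes that an exporter counts as enabled when its endpoint or port is set, as the documentation hunks describe. It leaves out the `Grafana.OTLP.MetricsEnabled()` path that the real code also checks.

```go
package main

import "fmt"

// config is a hypothetical, flattened stand-in for beyla.Config.
type config struct {
	networkEnable bool     // stands in for NetworkFlows.Enable
	otelEndpoint  string   // assumed: OTEL metrics export is enabled when set
	otelFeatures  []string // stands in for Metrics.Features
	promPort      int      // assumed: Prometheus export is enabled when non-zero
	promFeatures  []string // stands in for Prometheus.Features
}

// contains reports whether item is present in list.
func contains(list []string, item string) bool {
	for _, v := range list {
		if v == item {
			return true
		}
	}
	return false
}

// netO11yEnabled follows the shape of the new FeatureNetO11y case:
// explicit enable OR an enabled exporter that lists "network".
func (c config) netO11yEnabled() bool {
	otelImplicit := c.otelEndpoint != "" && contains(c.otelFeatures, "network")
	promImplicit := c.promPort != 0 && contains(c.promFeatures, "network")
	return c.networkEnable || otelImplicit || promImplicit
}

func main() {
	implicit := config{otelEndpoint: "http://localhost:4318", otelFeatures: []string{"network"}}
	appOnly := config{otelEndpoint: "http://localhost:4318", otelFeatures: []string{"application"}}
	fmt.Println("implicit via OTEL features:", implicit.netO11yEnabled())
	fmt.Println("application features only:", appOnly.netO11yEnabled())
}
```

With the new default of `["application"]`, a configuration that only sets an exporter endpoint no longer turns on network metrics in this sketch, which matches the intent of the default change in this commit.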
24 changes: 22 additions & 2 deletions pkg/beyla/config_test.go
@@ -120,7 +120,7 @@ network:
DurationHistogram: []float64{0, 1, 2},
RequestSizeHistogram: otel.DefaultBuckets.RequestSizeHistogram,
},
Features: []string{"network", "application"},
Features: []string{"application"},
Instrumentations: []string{
instrumentations.InstrumentationALL,
},
@@ -140,7 +140,7 @@ network:
},
Prometheus: prom.PrometheusConfig{
Path: "/metrics",
Features: []string{otel.FeatureNetwork, otel.FeatureApplication},
Features: []string{otel.FeatureApplication},
Instrumentations: []string{
instrumentations.InstrumentationALL,
},
@@ -288,6 +288,26 @@ func TestConfig_OtelGoAutoEnv(t *testing.T) {
assert.True(t, cfg.Exec.IsSet()) // Exec maps to BEYLA_EXECUTABLE_NAME
}

func TestConfig_NetworkImplicit(t *testing.T) {
// Listing "network" in the OTEL metrics features, together with a configured
// OTLP endpoint, should implicitly enable network observability
require.NoError(t, os.Setenv("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4318"))
require.NoError(t, os.Setenv("BEYLA_OTEL_METRIC_FEATURES", "network"))
cfg, err := LoadConfig(bytes.NewReader(nil))
require.NoError(t, err)
assert.True(t, cfg.Enabled(FeatureNetO11y)) // Net o11y should be on
}

func TestConfig_NetworkImplicitProm(t *testing.T) {
// Listing "network" in the Prometheus features, together with a configured
// Prometheus port, should implicitly enable network observability
require.NoError(t, os.Setenv("BEYLA_PROMETHEUS_PORT", "9090"))
require.NoError(t, os.Setenv("BEYLA_PROMETHEUS_FEATURES", "network"))
cfg, err := LoadConfig(bytes.NewReader(nil))
require.NoError(t, err)
assert.True(t, cfg.Enabled(FeatureNetO11y)) // Net o11y should be on
}

func loadConfig(t *testing.T, env map[string]string) *Config {
for k, v := range env {
require.NoError(t, os.Setenv(k, v))
2 changes: 1 addition & 1 deletion pkg/beyla/network_cfg.go
@@ -120,7 +120,7 @@ type NetworkConfig struct {
}

var defaultNetworkConfig = NetworkConfig{
Source: EbpfSourceTC,
Source: EbpfSourceSock,
AgentIPIface: "external",
AgentIPType: "any",
ExcludeInterfaces: []string{"lo"},
60 changes: 46 additions & 14 deletions pkg/components/beyla.go
@@ -2,14 +2,16 @@ package components

import (
"context"
"fmt"
"log/slog"
"os"
"slices"
"sync"

"github.com/grafana/beyla/pkg/beyla"
"github.com/grafana/beyla/pkg/internal/appolly"
"github.com/grafana/beyla/pkg/internal/connector"
"github.com/grafana/beyla/pkg/internal/export/attributes"
"github.com/grafana/beyla/pkg/internal/export/otel"
"github.com/grafana/beyla/pkg/internal/imetrics"
"github.com/grafana/beyla/pkg/internal/kube"
"github.com/grafana/beyla/pkg/internal/netolly/agent"
@@ -19,7 +21,7 @@ import (

// RunBeyla in the foreground process. This is a blocking function and won't exit
// until both the AppO11y and NetO11y components end
func RunBeyla(ctx context.Context, cfg *beyla.Config) {
func RunBeyla(ctx context.Context, cfg *beyla.Config) error {
ctxInfo := buildCommonContextInfo(cfg)

wg := sync.WaitGroup{}
@@ -32,22 +34,33 @@ func RunBeyla(ctx context.Context, cfg *beyla.Config) {
wg.Add(1)
}

errs := make(chan error, 2)
if app {
go func() {
defer wg.Done()
setupAppO11y(ctx, ctxInfo, cfg)
if err := setupAppO11y(ctx, ctxInfo, cfg); err != nil {
errs <- err
}
}()
}
if net {
go func() {
defer wg.Done()
setupNetO11y(ctx, ctxInfo, cfg)
if err := setupNetO11y(ctx, ctxInfo, cfg); err != nil {
errs <- err
}
}()
}
wg.Wait()
select {
case err := <-errs:
return err
default:
return nil
}
}
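
The reworked `RunBeyla` above starts each enabled component in its own goroutine, collects failures on a buffered channel, and returns one of them once every component has finished. The pattern can be sketched in isolation as follows. `runAll`, `alwaysOK` and `alwaysFail` are hypothetical names for illustration and are not part of Beyla.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// runAll runs every task concurrently, waits for all of them, and returns
// one of the reported errors (or nil if none failed).
func runAll(tasks ...func() error) error {
	// Buffer one slot per task so a failing goroutine never blocks on send.
	errs := make(chan error, len(tasks))
	var wg sync.WaitGroup
	for _, task := range tasks {
		wg.Add(1)
		go func(run func() error) {
			defer wg.Done()
			if err := run(); err != nil {
				errs <- err
			}
		}(task)
	}
	wg.Wait()
	// Non-blocking read: the default branch fires when no task failed.
	select {
	case err := <-errs:
		return err
	default:
		return nil
	}
}

// alwaysOK and alwaysFail are placeholder components for the example.
func alwaysOK() error   { return nil }
func alwaysFail() error { return errors.New("can't start network metrics capture") }

func main() {
	fmt.Println(runAll(alwaysOK, alwaysOK))
	fmt.Println(runAll(alwaysOK, alwaysFail))
}
```

One consequence of this design is that an error is only surfaced after `wg.Wait()` returns, so a failure in one component is reported once the other component has also ended.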

func setupAppO11y(ctx context.Context, ctxInfo *global.ContextInfo, config *beyla.Config) {
func setupAppO11y(ctx context.Context, ctxInfo *global.ContextInfo, config *beyla.Config) error {
slog.Info("starting Beyla in Application Observability mode")
// TODO: when we split Beyla in two processes with different permissions, this code can be split:
// in two parts:
@@ -56,26 +69,45 @@ func setupAppO11y(ctx context.Context, ctxInfo *global.ContextInfo, config *beyl

instr := appolly.New(ctx, ctxInfo, config)
if err := instr.FindAndInstrument(); err != nil {
slog.Error("Beyla couldn't find target process", "error", err)
os.Exit(-1)
return fmt.Errorf("can't find target process: %w", err)
}
if err := instr.ReadAndForward(); err != nil {
slog.Error("Beyla couldn't start read and forwarding", "error", err)
os.Exit(-1)
return fmt.Errorf("can't start read and forwarding: %w", err)
}
return nil
}

func setupNetO11y(ctx context.Context, ctxInfo *global.ContextInfo, cfg *beyla.Config) {
func setupNetO11y(ctx context.Context, ctxInfo *global.ContextInfo, cfg *beyla.Config) error {
if msg := mustSkip(cfg); msg != "" {
slog.Warn(msg + ". Skipping Network metrics component")
return nil
}
slog.Info("starting Beyla in Network metrics mode")
flowsAgent, err := agent.FlowsAgent(ctxInfo, cfg)
if err != nil {
slog.Error("can't start network metrics capture", "error", err)
os.Exit(-1)
return fmt.Errorf("can't start network metrics capture: %w", err)
}
if err := flowsAgent.Run(ctx); err != nil {
slog.Error("can't start network metrics capture", "error", err)
os.Exit(-1)
return fmt.Errorf("can't start network metrics capture: %w", err)
}
return nil
}

func mustSkip(cfg *beyla.Config) string {
otelEnabled := cfg.Metrics.Enabled()
otelFeature := slices.Contains(cfg.Metrics.Features, otel.FeatureNetwork)
promEnabled := cfg.Prometheus.Enabled()
promFeature := slices.Contains(cfg.Prometheus.Features, otel.FeatureNetwork)
if otelEnabled && !otelFeature && !promEnabled {
return "network not present in BEYLA_OTEL_METRICS_FEATURES"
}
if promEnabled && !promFeature && !otelEnabled {
return "network not present in BEYLA_PROMETHEUS_FEATURES"
}
if promEnabled && !promFeature && otelEnabled && !otelFeature {
return "network not present in either BEYLA_PROMETHEUS_FEATURES or BEYLA_OTEL_METRICS_FEATURES"
}
return ""
}
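
The three conditions in `mustSkip` reduce to one rule: skip the network component only when every enabled exporter omits `network` from its features. A small standalone sketch of that decision table follows. The `exporter` type, `skipReason` function and the returned messages are hypothetical simplifications, not the Beyla API.

```go
package main

import "fmt"

// exporter is a hypothetical stand-in for the OTEL or Prometheus exporter config.
type exporter struct {
	enabled  bool     // whether the exporter is configured at all
	features []string // contents of the exporter's features list
}

// hasNetwork reports whether "network" is listed in the exporter's features.
func (e exporter) hasNetwork() bool {
	for _, f := range e.features {
		if f == "network" {
			return true
		}
	}
	return false
}

// skipReason returns a non-empty message when network metrics should be
// skipped, using the same three cases as mustSkip.
func skipReason(otel, prom exporter) string {
	otelMissing := otel.enabled && !otel.hasNetwork()
	promMissing := prom.enabled && !prom.hasNetwork()
	switch {
	case otelMissing && promMissing:
		return "network missing from both feature lists"
	case otelMissing && !prom.enabled:
		return "network missing from the OTEL feature list"
	case promMissing && !otel.enabled:
		return "network missing from the Prometheus feature list"
	}
	return ""
}

func main() {
	app := []string{"application"}
	net := []string{"network"}
	fmt.Printf("%q\n", skipReason(exporter{true, app}, exporter{}))
	fmt.Printf("%q\n", skipReason(exporter{true, app}, exporter{true, net}))
	fmt.Printf("%q\n", skipReason(exporter{}, exporter{}))
}
```

Note that when neither exporter is enabled the function returns an empty string, so the component is not skipped on those grounds. The same holds for `mustSkip` as written in this hunk.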

// BuildContextInfo populates some globally shared components and properties