
[receiver/receiver_creator] Add support for enabling receivers/scrapers from K8s hints #35617

Open · ChrsMark wants to merge 5 commits into main from the hints branch
Conversation

@ChrsMark (Member) commented on Oct 4, 2024

Description

This PR implements the metrics part suggested at #34427 (comment). The current implementation only supports hints coming from Pods (a work-in-progress branch with logs support is at https://github.com/open-telemetry/opentelemetry-collector-contrib/compare/main...ChrsMark:opentelemetry-collector-contrib:logs_hints?expand=1).

This PR implements what was decided at #34427 (comment).

See the README docs for a description of this feature: https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/35617/files#diff-4127365c4062a7510fb7fede0fa239e9232549732898303d94c12fef0433d39d

TODOs:

  • turn the feature on/off using a setting
  • add docs
  • add tests

Link to tracking issue

Part of #34427

Testing

Added unit tests.

Documentation

Added

How to test this locally

  1. Use the following values to install the Collector's Helm chart:
mode: daemonset

image:
  repository: otelcontribcol-dev
  tag: "latest"
  pullPolicy: IfNotPresent

command:
  name: otelcontribcol

clusterRole:
  create: true
  rules:
    - apiGroups: [""]
      resources: ["pods", "nodes"]
      verbs: ["get", "list", "watch"]
    - apiGroups: [""]
      resources: ["nodes/proxy"]
      verbs: ["get"]
    - apiGroups: [""]
      resources: ["nodes/stats"]
      verbs: ["get"]
    - nonResourceURLs: ["/metrics"]
      verbs: ["get"]

config:
  extensions:
    k8s_observer:
      auth_type: serviceAccount
      node: ${env:K8S_NODE_NAME}
      observe_nodes: true
  exporters:
    debug:
      verbosity: basic
  receivers:
    receiver_creator/metrics:
      watch_observers: [ k8s_observer ]
      discovery:
        enabled: true
      receivers:

  service:
    extensions: [health_check, k8s_observer]
    telemetry:
      logs:
        level: DEBUG
    pipelines:
      metrics:
        receivers: [ receiver_creator/metrics ]
        processors: [ batch ]
        exporters: [ debug ]
  2. Deploy the following manifests (a ConfigMap and a Deployment); the resulting Pod consists of 2 containers (an equivalent hand-written receiver configuration is sketched after step 3):
apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx-conf
data:
  nginx.conf: |
    user  nginx;
    worker_processes  1;
    error_log  /dev/stderr warn;
    pid        /var/run/nginx.pid;
    events {
      worker_connections  1024;
    }
    http {
      include       /etc/nginx/mime.types;
      default_type  application/octet-stream;
    
      log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                        '$status $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$http_x_forwarded_for"';
      access_log  /dev/stdout main;
      server {
          listen 80;
          server_name localhost;
    
          location /nginx_status {
              stub_status on;
          }
      }
      include /etc/nginx/conf.d/*;
    }
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-deployment
  labels:
    app: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
      annotations:
        # redis container port hints
        io.opentelemetry.discovery.metrics.6379/enabled: "true"
        io.opentelemetry.discovery.metrics.6379/scraper: redis
        io.opentelemetry.discovery.metrics.6379/config: |
          collection_interval: "20s"
          timeout: "10s"

        # nginx container port hints
        io.opentelemetry.discovery.metrics.80/enabled: "true"
        io.opentelemetry.discovery.metrics.80/scraper: nginx
        io.opentelemetry.discovery.metrics.80/config: |
          endpoint: "http://`endpoint`/nginx_status"
          collection_interval: "30s"
          timeout: "20s"
    spec:
      volumes:
      - name: nginx-conf
        configMap:
          name: nginx-conf
          items:
            - key: nginx.conf
              path: nginx.conf
      containers:
        - name: webserver
          image: nginx:latest
          ports:
            - containerPort: 80
              name: webserver
          volumeMounts:
            - mountPath: /etc/nginx/nginx.conf
              readOnly: true
              subPath: nginx.conf
              name: nginx-conf
        - image: redis
          imagePullPolicy: IfNotPresent
          name: redis
          ports:
            - name: redis
              containerPort: 6379
              protocol: TCP
  3. After the target Pod is deployed, check the Collector logs and verify that the 2 receivers start properly and metrics are collected from both target containers:
2024-10-04T13:53:02.550Z	info	receivercreator@<version>/observerhandler.go:115	starting receiver	{"kind": "receiver", "name": "receiver_creator/metrics", "data_type": "metrics", "name": "nginx/e6d880c9-13a0-45e7-a3a1-27fa021802b4_80", "endpoint": "10.244.0.52:80", "endpoint_id": "k8s_observer/e6d880c9-13a0-45e7-a3a1-27fa021802b4/webserver(80)", "config": {"collection_interval":"20s","endpoint":"http://`endpoint`/nginx_status"}}
....
2024-10-04T13:53:02.550Z	info	receivercreator@<version>/observerhandler.go:115	starting receiver	{"kind": "receiver", "name": "receiver_creator/metrics", "data_type": "metrics", "name": "redis/e6d880c9-13a0-45e7-a3a1-27fa021802b4_6379", "endpoint": "10.244.0.52:6379", "endpoint_id": "k8s_observer/e6d880c9-13a0-45e7-a3a1-27fa021802b4/redis(6379)", "config": {"collection_interval":"20s","endpoint":"10.244.0.52:6379"}}
2024-10-04T13:53:03.646Z	info	MetricsExporter	{"kind": "exporter", "data_type": "metrics", "name": "debug", "resource metrics": 2, "metrics": 30, "data points": 38}
2024-10-04T13:53:06.547Z	debug	memorylimiter/memorylimiter.go:181	Currently used memory.	{"kind": "processor", "name": "memory_limiter", "pipeline": "traces", "cur_mem_mib": 39}
2024-10-04T13:53:11.547Z	debug	memorylimiter/memorylimiter.go:181	Currently used memory.	{"kind": "processor", "name": "memory_limiter", "pipeline": "traces", "cur_mem_mib": 39}
2024-10-04T13:53:16.546Z	debug	memorylimiter/memorylimiter.go:181	Currently used memory.	{"kind": "processor", "name": "memory_limiter", "pipeline": "traces", "cur_mem_mib": 40}
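
For reference, the hints in step 2 are meant to yield receiver configurations roughly equivalent to the following hand-written setup (a sketch derived purely from the annotation values above; the actually generated configs are visible in the logs):

receivers:
  redis:
    endpoint: "`endpoint`"  # resolved to the discovered <pod-ip>:6379
    collection_interval: "20s"
    timeout: "10s"
  nginx:
    endpoint: "http://`endpoint`/nginx_status"
    collection_interval: "30s"
    timeout: "20s"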

More testing target Pods
busybox-multi-logs.txt
busybox-single-logs.txt
redis-nginx-mutli.txt
redis-single.txt
redis-single-logs.txt

@ChrsMark (Member, Author) commented on Oct 7, 2024

@dmitryax I have opened this for review, please feel free to take a look when you get the chance.

Since #35544 was merged, I could also add the logs part here if that's easier for you to review; however, that would make the PR significantly bigger. Let me know what you prefer.

Update: the logs part is now included at https://github.com/open-telemetry/opentelemetry-collector-contrib/compare/main...ChrsMark:opentelemetry-collector-contrib:logs_hints?expand=1

@ChrsMark ChrsMark force-pushed the hints branch 11 times, most recently from 5b8845d to 7c65ab2 Compare October 11, 2024 12:55
receiver/receivercreator/discovery.go: 3 review threads (outdated, resolved)
@ChrsMark (Member, Author) commented on Nov 19, 2024

@dmitryax based on the discussion at #35617 (comment), I have made the io.opentelemetry.discovery.metrics/enabled hint mandatory, with true or false as the only allowed values.

PTAL.
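
For illustration, with the enabled hint now mandatory, a Pod must carry an annotation like the following before any scraper hint takes effect (a minimal sketch mirroring the annotations in the test Deployment above):

metadata:
  annotations:
    # Mandatory; only "true" or "false" are accepted.
    io.opentelemetry.discovery.metrics.6379/enabled: "true"
    io.opentelemetry.discovery.metrics.6379/scraper: redis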

Code excerpt under review:

	}
	return k8sHintsBuilder{
		logger: logger,
		config: config,
A reviewer (Member) commented:
Why do we need to store the full config if we have a separate field for ignoreReceivers? It could be just enabled, right?

A reviewer (Member) commented:
Even enabled doesn't seem to be used...

@ChrsMark (Member, Author) commented on Nov 21, 2024:

Must be a leftover from one of the re-designs, good catch! Removed at ddc4655, thanks!

@dmitryax (Member) left a comment:

LGTM with a nit

@ChrsMark (Member, Author) commented:

Thanks @dmitryax, I've removed the redundant config field. It should be good to go now.

Comment on lines +31 to +34
type hintsTemplatesBuilder interface {
	createReceiverTemplateFromHints(env observer.EndpointEnv) (*receiverTemplate, error)
}
A reviewer (Member) commented:

This interface doesn't seem to be used anywhere, right?

@ChrsMark (Member, Author) commented on Nov 22, 2024:

Right now it's only implemented by k8sHintsBuilder. The idea is to keep the option open to implement it for other environments, like Docker; this feature would make sense there as well. A sketch of what that could look like follows.
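
For illustration, here is a rough sketch of how a hypothetical Docker-based builder could satisfy the same interface; dockerHintsBuilder and the simplified stand-in types below are invented for this example and are not part of the PR:

package main

import (
	"errors"
	"fmt"
)

// endpointEnv and receiverTemplate are simplified stand-ins for the PR's
// observer.EndpointEnv and receiverTemplate types.
type endpointEnv map[string]any

type receiverTemplate struct {
	receiverType string
	config       map[string]any
}

// The interface from the PR, with the signature adapted to the stand-in types.
type hintsTemplatesBuilder interface {
	createReceiverTemplateFromHints(env endpointEnv) (*receiverTemplate, error)
}

// dockerHintsBuilder would read hints from container labels instead of
// Pod annotations.
type dockerHintsBuilder struct{}

func (b dockerHintsBuilder) createReceiverTemplateFromHints(env endpointEnv) (*receiverTemplate, error) {
	labels, ok := env["labels"].(map[string]string)
	if !ok {
		return nil, errors.New("no container labels in endpoint environment")
	}
	// The enabled hint is mandatory, mirroring the K8s implementation.
	if labels["io.opentelemetry.discovery.metrics/enabled"] != "true" {
		return nil, nil
	}
	scraper := labels["io.opentelemetry.discovery.metrics/scraper"]
	if scraper == "" {
		return nil, errors.New("scraper hint is missing")
	}
	return &receiverTemplate{receiverType: scraper, config: map[string]any{}}, nil
}

func main() {
	var b hintsTemplatesBuilder = dockerHintsBuilder{}
	tmpl, err := b.createReceiverTemplateFromHints(endpointEnv{
		"labels": map[string]string{
			"io.opentelemetry.discovery.metrics/enabled": "true",
			"io.opentelemetry.discovery.metrics/scraper": "nginx",
		},
	})
	fmt.Println(tmpl, err)
}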

		return userConfigMap{}
	}

	if !strings.Contains(confEndpoint, defaultEndpoint) {
@dmitryax (Member) commented on Nov 22, 2024:

I have a couple of questions here:

  1. Will this work for an annotation with the "`endpoint`" string in it? Like the one we have in the README:
     io.opentelemetry.discovery.metrics/config: |
       endpoint: "http://`endpoint`/nginx_status"
  2. I think "contains" isn't a strong enough check. Someone can configure the annotation to be "http://another-endpoint/nginx_status?my-endpoint:8888" and the collector will scrape "another-endpoint". I think we should use a URI pattern check here instead.

@ChrsMark (Member, Author) commented on Nov 23, 2024:

Right on both counts; the validation indeed had to be improved. Fixed in the last commit (9f4ebcb).
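
A minimal sketch of the kind of URI-based check suggested above; endpointMatches is a hypothetical helper name, not the PR's actual implementation (see commit 9f4ebcb for that):

package main

import (
	"fmt"
	"net/url"
	"strings"
)

// endpointMatches reports whether the endpoint configured via the hint
// actually targets the discovered endpoint's host:port, rather than merely
// containing it somewhere in the string.
func endpointMatches(confEndpoint, defaultEndpoint string) bool {
	// Bare host:port values (no scheme) cannot be parsed reliably by
	// url.Parse, so give them a placeholder scheme first.
	if !strings.Contains(confEndpoint, "://") {
		confEndpoint = "scheme://" + confEndpoint
	}
	u, err := url.Parse(confEndpoint)
	if err != nil {
		return false
	}
	return u.Host == defaultEndpoint
}

func main() {
	// The README-style annotation value passes...
	fmt.Println(endpointMatches("http://10.244.0.52:80/nginx_status", "10.244.0.52:80")) // true
	// ...while an endpoint that merely *contains* the target is rejected.
	fmt.Println(endpointMatches("http://another-endpoint/nginx_status?10.244.0.52:80", "10.244.0.52:80")) // false
}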

Signed-off-by: ChrsMark <[email protected]>