Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Beyla filtering out a Python process #1486

Open
gouthamve opened this issue Jan 2, 2025 · 0 comments
Open

Beyla filtering out a Python process #1486

gouthamve opened this issue Jan 2, 2025 · 0 comments

Comments

@gouthamve
Copy link
Member

I have a weird situation where I am running a pod with two containers, one Python service and one Golang service. The Golang service is being detected and instrumented, however I am not seeing any metrics or traces for the Python service.

The Python service in question is run using uvicorn, and ps -ef on the host gives the following:

nobody      3073    3037  0 Jan01 ?        00:08:35 /opt/pysetup/.venv/bin/python /opt/pysetup/.venv/bin/uvicorn beaverhabits.main:app --workers 1 --log-level info --port 8080 --host 0.0.0.0

I am running this pod in k3s on a Raspberry Pi 5, compiled with the following settings: https://www.gouthamve.dev/building-the-raspberry-pi-kernel-for-ebpf-beyla/

I am running Beyla v1.9.4.

Beyla Config:
➜  officelab git:(main) ✗ k -n monitoring get cm k8s-monitoring-beyla -oyaml
apiVersion: v1
data:
  beyla-config.yml: |-
    attributes:
      kubernetes:
        cluster_name: office
        enable: true
      select:
        beyla_network_flow_bytes:
          include:
          - k8s.src.owner.type
          - k8s.dst.owner.type
          - direction
    discovery:
      services:
      - k8s_namespace: .
    filter:
      network:
        k8s_dst_owner_name:
          not_match: '{kube*,*jaeger-agent*,*prometheus*,*promtail*,*grafana-agent*}'
        k8s_src_owner_name:
          not_match: '{kube*,*jaeger-agent*,*prometheus*,*promtail*,*grafana-agent*}'
    internal_metrics:
      prometheus:
        path: /internal/metrics
        port: 9090
    otel_traces_export:
      endpoint: http://k8s-monitoring-alloy.monitoring.svc.cluster.local:4317/
    prometheus_export:
      features:
      - application
      - network
      - application_service_graph
      - application_span
      path: /metrics
      port: 9090
    routes:
      patterns:
      - /bookmarks
      - /bookmarks/unread
      - /bookmarks/archives
      - /bookmarks/labels
      - /bookmarks/videos
      - /bookmarks/labels/{id}
      - /bookmarks/{id}
      - /api/bookmarks/{id}/share
      - /api/bookmarks/{id}/annotations
      - /api/bookmarks/{id}/article
      - /manifest.webmanifest
      - /@b/{id}
      - /asserts/*
      - /bm/*
      - /_nicegui*
      - /favicon.ico
      unmatched: heuristic
Default logs:
1735807239724	time=2025-01-02T08:40:39.724Z level=INFO msg="Grafana Beyla" Version=v1.9.4 Revision=62fb834 "OpenTelemetry SDK Version"=1.28.0
1735807241045	time=2025-01-02T08:40:41.044Z level=INFO msg="got host ID" component=ContextInfo func=fetchHostID fetcher=KubeNode hostID=6d1a1259851d4c42b2c2fb51855b6be0
1735807241045	time=2025-01-02T08:40:41.044Z level=INFO msg="starting Beyla in Network metrics mode"
1735807241045	time=2025-01-02T08:40:41.044Z level=INFO msg="starting Beyla in Application Observability mode"
1735807241045	time=2025-01-02T08:40:41.044Z level=INFO msg="initializing Flows agent" component=agent.Flows
1735807241045	time=2025-01-02T08:40:41.044Z level=INFO msg="using socket filter for collecting network events" component=agent.Flows
1735807241207	time=2025-01-02T08:40:41.207Z level=INFO msg="starting Flows agent" component=agent.Flows
1735807241207	time=2025-01-02T08:40:41.207Z level=INFO msg="interface detected. Registering flow ebpfFetcher" component=agent.Flows interface="{Name:eth0 Index:2}"
1735807241248	time=2025-01-02T08:40:41.247Z level=INFO msg="using hostname" component=traces.ReadDecorator function=instance_ID_hostNamePIDDecorator hostname=k8s-monitoring-beyla-m5jdn
1735807241248	time=2025-01-02T08:40:41.247Z level=INFO msg="Flows agent successfully started" component=agent.Flows
1735807241248	time=2025-01-02T08:40:41.248Z level=INFO msg="opening prometheus scrape endpoint" component=connector.PrometheusManager port=9090 path=/internal/metrics
1735807241248	time=2025-01-02T08:40:41.248Z level=INFO msg="opening prometheus scrape endpoint" component=connector.PrometheusManager port=9090 path=/metrics
1735807241249	time=2025-01-02T08:40:41.249Z level=INFO msg="Starting main node" component=beyla.Instrumenter
1735807247194	time=2025-01-02T08:40:47.191Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/local/memos/memos pid=2756 ino=110368885 type=go
1735807248044	time=2025-01-02T08:40:48.044Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/csi-node-driver-registrar pid=2114 ino=109981284 type=go
1735807248067	time=2025-01-02T08:40:48.067Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/bin/coreutils pid=173469 ino=110645714 type=generic
1735807248765	time=2025-01-02T08:40:48.765Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/local/bin/litestream pid=2883 ino=110254972 type=generic
1735807248774	time=2025-01-02T08:40:48.774Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/bin/bash pid=2496 ino=110376675 type=generic
1735807248778	time=2025-01-02T08:40:48.778Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/configmap-reload pid=11802 ino=109852878 type=go
1735807248807	time=2025-01-02T08:40:48.807Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/bin/busybox pid=3356 ino=110241576 type=generic
1735807248812	time=2025-01-02T08:40:48.812Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/bin/alloy pid=11765 ino=110362638 type=generic
1735807248822	time=2025-01-02T08:40:48.822Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/local/bin/cephcsi pid=2482 ino=110756546 type=go
1735807248835	time=2025-01-02T08:40:48.835Z level=INFO msg="skipping self-instrumentation of Beyla process" component=discover.TraceAttacher cmd=/beyla pid=177384
1735807248835	time=2025-01-02T08:40:48.835Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/local/bin/python3.12 pid=3073 ino=110498680 type=python
1735807248844	time=2025-01-02T08:40:48.844Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/bin/node_exporter pid=2333 ino=109840374 type=go
1735807248893	time=2025-01-02T08:40:48.893Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/local/slash/slash pid=3044 ino=110257473 type=go
1735807248908	time=2025-01-02T08:40:48.908Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/bin/dash pid=3037 ino=110494101 type=generic
1735807481617	time=2025-01-02T08:44:41.617Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/bin/coreutils pid=178339 ino=110645714 type=generic
1735808381819	time=2025-01-02T08:59:41.819Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/bin/coreutils pid=180216 ino=110645714 type=generic
1735809282032	time=2025-01-02T09:14:42.031Z level=INFO msg="instrumenting process" component=discover.TraceAttacher cmd=/usr/bin/coreutils pid=182107 ino=110645714 type=generic
1735810182232	time=2025-01-02T09:29:42.232Z
Debug logs filtered on "pid=3073"

  |   | time=2025-01-01T18:25:09.113Z level=DEBUG msg="Swapped connection info for userspace, client = 1, orig_dport = 55616" component=BPFLogger pid=3073 comm=uvicorn |  
-- | -- | -- | --
  |   | time=2025-01-01T18:25:09.113Z level=DEBUG msg="=== kretprobe tcp_sendmsg=3073 sent 6===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="=== tcp_recvmsg id=3073 sock=ffff80010e525a00 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="Checking if we need to finish the request per thread id" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="Checking if we need to finish the request per socket" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="=== tcp_cleanup_rbuf id=3073 copied_len 10 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="=== return recvmsg id=3073 args=ffffc00089af7a60 copied_len 10 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="iter_type=5" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="nr_segs=1, iov=00000000dcde8561, ubuf=00000000dcde8561" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="buf=[\x8a\x84\xef:+5h\xfd\xb2l\x8e\xbf\xac\xa6\xb7X=\x81\x06\xa9ŀ3\x8f\xe0r\xd6\xef\x7f\xff\xff\xff\n], pid=3073, len=10" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="Sending TCP trace ffffc00083600188, response length 10" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.115Z level=DEBUG msg="=== kretprobe_tcp_recvmsg id=3073 copied_len 10 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.116Z level=DEBUG msg="=== return recvmsg id=3073 args=0 copied_len 10 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.345Z level=DEBUG msg="=== tcp_rcv_established id=3073 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.346Z level=DEBUG msg="=== tcp_rcv_established id=3073 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.867Z level=DEBUG msg="=== tcp_rcv_established id=3073 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.983Z level=DEBUG msg="=== kprobe tcp_sendmsg=3073 sock=ffff80010e525a00 size 5===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.983Z level=DEBUG msg="[conn] s_h = 0, s_l = b8052a0affff0000, s_port=8080" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.983Z level=DEBUG msg="[conn] d_h = 0, d_l = 23002a0affff0000, d_port=55616" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.983Z level=DEBUG msg="iter_type=5" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.983Z level=DEBUG msg="nr_segs=1, iov=0000000040b50af4, ubuf=0000000040b50af4" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.983Z level=DEBUG msg="buf=[\xc1\x032\x02], pid=3073, len=5" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.983Z level=DEBUG msg="Swapped connection info for userspace, client = 1, orig_dport = 55616" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.983Z level=DEBUG msg="=== kretprobe tcp_sendmsg=3073 sent 5===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="=== tcp_recvmsg id=3073 sock=ffff80010e525a00 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="Checking if we need to finish the request per thread id" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="Checking if we need to finish the request per socket" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="=== tcp_cleanup_rbuf id=3073 copied_len 9 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="=== return recvmsg id=3073 args=ffffc00089af7928 copied_len 9 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="iter_type=5" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="nr_segs=1, iov=00000000dcde8561, ubuf=00000000dcde8561" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="buf=[\xc1\x83\x9bL\xf6\xc0\xa9J\xf6U\xb4\xac\x8aW3\xaa\xd9\xf4\xc1oi\xf0\x89_\xe0r\xd6\xef\x7f\xff\xff\xff\t], pid=3073, len=9" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="Sending TCP trace ffffc00083600188, response length 9" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="=== kretprobe_tcp_recvmsg id=3073 copied_len 9 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:09.987Z level=DEBUG msg="=== return recvmsg id=3073 args=0 copied_len 9 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.252Z level=DEBUG msg="=== tcp_rcv_established id=3073 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.252Z level=DEBUG msg="=== tcp_rcv_established id=3073 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.494Z level=DEBUG msg="=== kprobe tcp_sendmsg=3073 sock=ffff80010e52a200 size 5===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.494Z level=DEBUG msg="[conn] s_h = 0, s_l = b8052a0affff0000, s_port=8080" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.494Z level=DEBUG msg="[conn] d_h = 0, d_l = 23002a0affff0000, d_port=55636" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.495Z level=DEBUG msg="iter_type=5" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.495Z level=DEBUG msg="nr_segs=1, iov=000000002fd6145b, ubuf=000000002fd6145b" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.495Z level=DEBUG msg="buf=[\xc1\x032\x02], pid=3073, len=5" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.495Z level=DEBUG msg="Swapped connection info for userspace, client = 1, orig_dport = 55636" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.495Z level=DEBUG msg="=== kretprobe tcp_sendmsg=3073 sent 5===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="=== tcp_recvmsg id=3073 sock=ffff80010e52a200 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="Checking if we need to finish the request per thread id" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="Checking if we need to finish the request per socket" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="=== tcp_cleanup_rbuf id=3073 copied_len 9 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="=== return recvmsg id=3073 args=ffffc00089af7928 copied_len 9 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="iter_type=5" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="nr_segs=1, iov=00000000dcde8561, ubuf=00000000dcde8561" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="buf=[\xc1\x83\xe6\xe4ލ\xd4\xe2\xdeZ\xc0\xe7ep\xe0\x05\xb8¶Z6[B\x01\xe0r\xd6\xef\x7f\xff\xff\xff\t], pid=3073, len=9" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="Sending TCP trace ffffc00083600188, response length 9" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="=== kretprobe_tcp_recvmsg id=3073 copied_len 9 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.498Z level=DEBUG msg="=== return recvmsg id=3073 args=0 copied_len 9 ===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.910Z level=DEBUG msg="=== kprobe tcp_sendmsg=3073 sock=ffff80010e529000 size 5===" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.910Z level=DEBUG msg="[conn] s_h = 0, s_l = b8052a0affff0000, s_port=8080" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.910Z level=DEBUG msg="[conn] d_h = 0, d_l = 23002a0affff0000, d_port=55596" component=BPFLogger pid=3073 comm=uvicorn |  
  |   | time=2025-01-01T18:25:12.910Z level=DEBUG msg="iter_type=5" component=BPFLogger pid=3073 comm=uvicorn

I've also noticed this interesting line:
Image

Its coming from this function which seems to exclude the data from the process entirely:

func (pf *PIDsFilter) Filter(inputSpans []request.Span) []request.Span {
pf.mux.RLock()
defer pf.mux.RUnlock()
// todo: adaptive presizing as a function of the historical percentage
// of filtered spans
outputSpans := make([]request.Span, 0, len(inputSpans))
for i := range inputSpans {
span := &inputSpans[i]
// We first confirm that the current namespace seen by BPF is tracked by Beyla
ns, nsExists := pf.current[span.Pid.Namespace]
if !nsExists {
continue
}
// If the namespace exist, we confirm that we are tracking the user PID that Beyla
// saw. We don't check for the host pid, because we can't be sure of the number
// of container layers. The Host PID is always the outer most layer.
if info, pidExists := ns[span.Pid.UserPID]; pidExists {
if pf.detectOtel {
checkIfExportsOTel(info.service, span)
}
inputSpans[i].Service = *info.service
outputSpans = append(outputSpans, inputSpans[i])
}
}
if len(outputSpans) != len(inputSpans) {
pf.log.Debug("filtered spans from processes that did not match discovery",
"function", "PIDsFilter.Filter", "inLen", len(inputSpans), "outLen", len(outputSpans),
"pids", pf.current, "spans", inputSpans,
)
}
return outputSpans
}

Removing the routes: section and applying this diff didn't fix the issue:

           - direction
     discovery:
       services:
-      - k8s_namespace: .
+      - k8s_namespace: .*
     filter:
       network:
         k8s_dst_owner_name:

I have a hunch its because the non-debug logs are finding a python3.12 process, but the logs are saying comm=uvicorn. But this is about as deep as my knowledge goes. Please let me know if I can help somehow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant