fluent-bit not updating pod metadata when pod restarts/updates #11961

Description

@danfinn

Bug Report

Describe the bug
A Kubernetes StatefulSet was upgraded, which also changed the label values on its pods. One of the pods in the StatefulSet kept shipping logs with the old pod metadata (pod labels, container image tag, pod IP). In Grafana this made the pod look as if it had not been updated, but the pod's details on the Kubernetes cluster showed that it had been updated and was running the newer version.

It took us a while to work out why the Grafana logs from this pod still appeared to come from the old version, but we eventually traced it to fluent-bit. Restarting the fluent-bit pod running on the same node as this StatefulSet pod fixed the issue and corrected the pod metadata in the logs.

Here is a log viewed from grafana that was shipped via fluent-bit:

{
  "@timestamp": "2026-06-18T15:11:43.021Z",
  "_p": "F",
  "kubernetes.container_hash": "dev.azurecr.io/pw-app-windows@sha256:90bb1ca8e481270412503dd5dbc0b03add41b6cc3ea6428221fdba4d5835d5fb",
  "kubernetes.container_image": "dev.azurecr.io/pw-app-windows:25.0.5-stable.8032",
  "kubernetes.container_name": "pw-app-app-pw",
  "kubernetes.docker_id": "5c86a7792a8bc90d5e8a60d40ab5ca2008469e3c47872dc7481b89d0615da72f",
  "kubernetes.host": "akswp2200006p",
  "kubernetes.labels.app_name": "app-pw",
  "kubernetes.labels.apps_kubernetes_io/pod-index": "1",
  "kubernetes.labels.controller-revision-hash": "app-pw-596b649d4f",
  "kubernetes.labels.namespace": "pw-customer-1001383261",
  "kubernetes.labels.region": "southcentralus",
  "kubernetes.labels.statefulset_kubernetes_io/pod-name": "app-pw-1",
  "kubernetes.labels.version": "25.0.5-stable.8032",
  "kubernetes.namespace_name": "pw-customer-1001383261",
  "kubernetes.pod_id": "a7f564c4-2395-43db-96dd-06117f45e69a",
  "kubernetes.pod_ip": "10.27.249.145",
  "kubernetes.pod_name": "app-pw-1",
  "kubernetes_namespace.labels.azure-key-vault-env-injection": "enabled",
  "kubernetes_namespace.labels.deployment-name": "xorailpw11",
  "kubernetes_namespace.labels.kubernetes_io/metadata_name": "pw-customer-1001383261",
  "kubernetes_namespace.labels.plan-id": "66e88ef473e06f4318bdc3a5",
  "kubernetes_namespace.labels.purpose": "pw-customer-1001383261",
  "kubernetes_namespace.labels.ultimate-id": "1001383261",
  "kubernetes_namespace.name": "pw-customer-1001383261",
  "message": "2026-06-18 15:11:43.021000000 +0000 log.pwapp.bentleylogs: {\"message\":\"2026-06-18 15:11:43,021 ERROR [0x00009f78] pwise.socket - aaSockSec_ReadFromSocket returns -10101\"}",
  "stream": "stdout"
}

And here is the describe output for the pod as it is currently running on the k8s cluster:

Name:             app-pw-1
Namespace:        pw-customer-1001383261
Priority:         0
Service Account:  default
Node:             akswp2200006p/10.27.249.104
Start Time:       Wed, 03 Jun 2026 01:32:32 -0400
Labels:           app=app-pw
                  apps.kubernetes.io/pod-index=1
                  controller-revision-hash=app-pw-675b4f4ffb
                  namespace=pw-customer-1001383261
                  region=southcentralus
                  statefulset.kubernetes.io/pod-name=app-pw-1
                  version=26.0.0-stable.8017
Annotations:      checksum/pw-app-configmap: 0ccb05a3780d6998a15a509f7c33f14b26dd97bedc5df2c816890d800fa8e805
                  checksum/pw-app-dmskrnl-configmap: a0d371a59ea2117cdb2b98e261d3bdc53e5b9a260a46680240e616bcf69abc64
                  kubectl.kubernetes.io/restartedAt: 2025-09-03T13:16:21-06:00
                  prometheus.io/path: /metrics
                  prometheus.io/port: 9182
                  prometheus.io/scrape: true
                  secret.reloader.stakater.com/reload: pw-app-tls-cert
Status:           Running
IP:               10.27.249.116
IPs:
  IP:           10.27.249.116
Controlled By:  StatefulSet/app-pw
Containers:
  pw-app-app-pw:
    Container ID:  containerd://aab627760cba61d8a37ea2c8e557dcefb87a92e9d6e8df3cbc18628131359426
    Image:         devbentleyhosted.azurecr.io/pw-app-windows:26.0.0-stable.8017
    Image ID:      devbentleyhosted.azurecr.io/pw-app-windows@sha256:e991398d0a6e889cad022fbd7118b9b3ffcb27f7b00f283f99502726177c9b7e

For example, the version label on the logs arriving in Grafana is still the "old" value (25.0.5-stable.8032) from before the StatefulSet was updated (to 26.0.0-stable.8017).
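To make the mismatch easy to check, here is a minimal Python sketch that compares a shipped log record against the live pod. The values are copied from the log record and describe output in this report; the helper name `find_stale_fields` and the dictionary layout of `live_pod` are made up for illustration. It only compares labels whose keys are identical in both outputs, because fluent-bit rewrites some label keys (for example `.` becomes `_`).

```python
# Values copied from the Grafana log record in this report.
log_record = {
    "kubernetes.pod_name": "app-pw-1",
    "kubernetes.pod_ip": "10.27.249.145",
    "kubernetes.labels.version": "25.0.5-stable.8032",
    "kubernetes.labels.controller-revision-hash": "app-pw-596b649d4f",
}

# Values copied from the describe output of the running pod.
# The dictionary layout is an assumption made for this sketch.
live_pod = {
    "name": "app-pw-1",
    "ip": "10.27.249.116",
    "labels": {
        "version": "26.0.0-stable.8017",
        "controller-revision-hash": "app-pw-675b4f4ffb",
    },
}


def find_stale_fields(record, pod):
    """Return {field: (logged, live)} for every field that differs."""
    stale = {}
    logged_ip = record.get("kubernetes.pod_ip")
    if logged_ip != pod["ip"]:
        stale["pod_ip"] = (logged_ip, pod["ip"])
    for label, live_value in pod["labels"].items():
        logged = record.get(f"kubernetes.labels.{label}")
        if logged != live_value:
            stale[f"labels.{label}"] = (logged, live_value)
    return stale


stale = find_stale_fields(log_record, live_pod)
for field, (logged, live) in stale.items():
    print(f"{field}: logged={logged!r} live={live!r}")
```

With the values from this report, the pod IP, the version label, and the controller-revision-hash label are all reported as stale.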

Restarting the fluent-bit instance running on the k8s node that hosts this pod fixes the issue, and the logs then arrive in our observability stack with the correct, up-to-date labels.

We are running fluent-bit on Windows Kubernetes nodes in Azure AKS. We build our fluent-bit Windows DaemonSet from this image:
FROM fluent/fluent-bit:windows-2022-4.1.0

Expected behavior
fluent-bit should update pod metadata when a pod is restarted or updated.
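As a rough illustration of the behavior we expect, here is a toy Python model of a metadata cache. It is not fluent-bit's code and makes no claim about how fluent-bit caches metadata internally; the class `PodMetadataCache`, the `fetch` callback, and the `observed_id` parameter are all hypothetical. The assumption is that a StatefulSet pod keeps its name (`app-pw-1`) across an upgrade, so a cache keyed only by namespace and pod name would need some identifier that changes on recreation (here, a shortened container ID) to know when to refetch.

```python
class PodMetadataCache:
    """Toy cache keyed by (namespace, pod name). Hypothetical, for illustration."""

    def __init__(self, fetch):
        # fetch is a callable(namespace, name) -> metadata dict,
        # standing in for a lookup against the Kubernetes API.
        self._fetch = fetch
        self._entries = {}

    def get(self, namespace, name, observed_id):
        """Return metadata, refetching when the observed identifier changes."""
        key = (namespace, name)
        entry = self._entries.get(key)
        if entry is None or entry["observed_id"] != observed_id:
            entry = {
                "observed_id": observed_id,
                "meta": self._fetch(namespace, name),
            }
            self._entries[key] = entry
        return entry["meta"]


# Fake API state that is "upgraded" between lookups.
api_state = {"version": "25.0.5-stable.8032"}
calls = []


def fake_fetch(namespace, name):
    calls.append((namespace, name))
    return dict(api_state)


cache = PodMetadataCache(fake_fetch)
ns, pod = "pw-customer-1001383261", "app-pw-1"

before = cache.get(ns, pod, "5c86a779")   # first lookup, fetched
api_state["version"] = "26.0.0-stable.8017"
cached = cache.get(ns, pod, "5c86a779")   # same container, served from cache
after = cache.get(ns, pod, "aab62776")    # new container, refetched
```

In this model the same pod name with a new container ID triggers a fresh lookup, so `after` carries the new version label while `cached` still holds the old one.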

Screenshots

Your Environment
mentioned above

Additional context
We use this data to track application versions across deployments in different k8s clusters. Currently we cannot trust this data because of this issue with fluent-bit.
