We are seeing what appears to be a fluent-bit bug (low frequency and sporadic) in which the FB pod stops sending logs from the node correctly. In addition, the node's disk space slowly fills up as flb files are leaked onto the disk. The affected FB pod stays in the RUNNING state even after it receives a SIGTERM.
The fluent-bit engine shut down after 5 seconds; however, child processes/tasks such as input:tail:tail.0 kept running and accumulating flb files. The container was left running in a non-working state until manual intervention.
Fluent Bit Log Output
[engine] caught signal (SIGTERM)
[ info] [input] pausing tail.0
[ info] [input] pausing tail.1
[ info] [input] pausing tail.2
[ info] [input] pausing systemd.3
[ info] [input] pausing tail.4
[ info] [input] pausing tail.5
[ info] [input] pausing tail.6
[ info] [input] pausing tail.7
[ info] [input] pausing storage_backlog.8
[ warn] [engine] service will shutdown in max 5 seconds
[ info] [task] tail/tail.0 has 128 pending task(s):
...
[ info] [task] task_id=0 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
[ info] [task] task_id=1 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
[ info] [task] task_id=2 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
[ info] [task] task_id=3 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
[ info] [task] task_id=4 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
[ info] [task] task_id=5 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
[ info] [task] task_id=6 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
[ info] [task] task_id=7 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
[ info] [task] task_id=8 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
[ info] [task] task_id=9 still running on route(s): cloudwatch_logs/cloudwatch_logs.0
...
[ info] [engine] service has stopped (215 pending tasks)
[output:cloudwatch_logs:cloudwatch_logs.0] thread worker #0 stopping...
The output below shows the flb files leaked to the disk:
root@ip-:/var/fluent-bit/state/flb-storage/tail.0# while true; do echo "number of flb files" $(ls -1 | wc -l); sleep 1; done
number of flb files 5871
number of flb files 5866
number of flb files 5862
number of flb files 5859
number of flb files 5860
number of flb files 5856
number of flb files 5854
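For anyone trying to capture the same symptom over a longer window, the one-liner above can be extended to print a timestamp with each sample, so the counts can be lined up against the Fluent Bit log. This is only a rough sketch: the function names `count_flb_files` and `watch_flb_files` are made up for illustration, and it counts every regular file under the directory (recursively) rather than only the entries `ls -1` would list.

```shell
# Hypothetical helper: count regular files under a storage directory.
# Counts recursively, so the number can differ from `ls -1 | wc -l`.
count_flb_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Hypothetical helper: print a UTC-timestamped count a fixed number of times.
# $1 = directory, $2 = seconds between samples (default 1),
# $3 = number of samples (default 10).
watch_flb_files() {
    dir=$1
    interval=${2:-1}
    samples=${3:-10}
    i=0
    while [ "$i" -lt "$samples" ]; do
        echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) number of flb files $(count_flb_files "$dir")"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, `watch_flb_files /var/fluent-bit/state/flb-storage/tail.0 1 60` would take one sample per second for a minute on the directory shown above.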
Fluent Bit Version Info
Cluster Details
Application Details
Steps to reproduce issue
Related Issues