caught signal (SIGSEGV) in EKS 1.30 #880

Open
rsantacreu-ust opened this issue Dec 16, 2024 · 0 comments


Describe the question/issue

We're experiencing several restarts of the aws-for-fluent-bit component on EKS.

Configuration

Name: aws-for-fluent-bit-pro-ms
Namespace: monitoring
Labels: app.kubernetes.io/instance=aws-for-fluent-bit-pro-ms
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=aws-for-fluent-bit
app.kubernetes.io/version=2.32.2.20240516
argocd.argoproj.io/instance=aws-for-fluent-bit-pro-ms
helm.sh/chart=aws-for-fluent-bit-0.1.34
Annotations:

Data

fluent-bit.conf:

[SERVICE]
HTTP_Server On
HTTP_Listen 0.0.0.0
HTTP_PORT 2020
Health_Check On
HC_Errors_Count 5
HC_Retry_Failure_Count 5
HC_Period 5

Parsers_File /fluent-bit/parsers/parsers.conf

[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
DB /var/log/flb_kube.db
multiline.parser docker, cri
Mem_Buf_Limit 100MB
Skip_Long_Lines On
Refresh_Interval 10
[FILTER]
Name kubernetes
Match kube.*

Kube_URL https://kubernetes.default.svc.cluster.local:443
Merge_Log On
Merge_Log_Key data
Keep_Log On
K8S-Logging.Parser On
K8S-Logging.Exclude On
Buffer_Size 128k
[OUTPUT]
Name cloudwatch_logs
Match *
region eu-south-2
log_group_name /mylogroupname
log_stream_prefix fluentbit-
auto_create_group true
log_retention_days 7
[OUTPUT]
Name s3
Match *
bucket my-bucket
region eu-south-2
json_date_key date
json_date_format iso8601
total_file_size 100M
upload_chunk_size 6M
upload_timeout 10m
store_dir /tmp/fluent-bit/s3
s3_key_format /prod/pro-ms-001/$TAG/%Y-%m-%d/%H-%M-%S

auto_retry_requests           true
preserve_data_ordering        true
retry_limit                   1
external_id                   095761336548

BinaryData

Events:

Fluent Bit Log Output

[2024/12/16 12:23:55] [engine] caught signal (SIGSEGV)
[2024/12/16 12:23:54] [ info] [output:s3:s3.1] Successfully uploaded part
[2024/12/16 12:23:54] [ info] [output:s3:s3.1] UploadPart http status=200
[2024/12/16 12:23:53] [ info] [output:s3:s3.1] Successfully uploaded object /mylogroup/kube.var.log.containers.
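
The excerpt above is listed newest first, and pasted records can end up fused on one line. As a rough aid for triage, here is a minimal Python sketch that splits such text on the `[YYYY/MM/DD HH:MM:SS]` prefix, sorts the records chronologically, and returns everything up to the crash. The helper names `split_records` and `before_crash` are hypothetical, and the sample is modeled on the excerpt with two records deliberately left fused; it is not part of Fluent Bit or its tooling.

```python
import re
from datetime import datetime

# Matches the "[YYYY/MM/DD HH:MM:SS]" prefix that starts each record.
# Level tags such as "[ info]" do not match this pattern.
TS = re.compile(r"\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\]")


def split_records(text):
    """Split raw log text into (timestamp, message) pairs.

    Works even when two records are fused on the same line, because the
    split is driven by timestamp prefixes rather than by newlines.
    """
    matches = list(TS.finditer(text))
    records = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        stamp = datetime.strptime(m.group(1), "%Y/%m/%d %H:%M:%S")
        records.append((stamp, text[m.end():end].strip()))
    return records


def before_crash(text, marker="caught signal"):
    """Return records in chronological order, up to and including the crash."""
    ordered = sorted(split_records(text), key=lambda r: r[0])
    for i, (_, msg) in enumerate(ordered):
        if marker in msg:
            return ordered[: i + 1]
    return ordered


# Sample modeled on the excerpt (hypothetical input, second line left fused).
SAMPLE = (
    "[2024/12/16 12:23:55] [engine] caught signal (SIGSEGV)\n"
    "[2024/12/16 12:23:54] [ info] [output:s3:s3.1] Successfully uploaded part"
    "[2024/12/16 12:23:54] [ info] [output:s3:s3.1] UploadPart http status=200\n"
    "[2024/12/16 12:23:53] [ info] [output:s3:s3.1] Successfully uploaded object\n"
)

if __name__ == "__main__":
    for stamp, msg in before_crash(SAMPLE):
        print(stamp.isoformat(sep=" "), msg)
```

Read this way, the last activity before the signal is the S3 output (`s3.1`) completing an upload, which may help narrow down where to look.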

Fluent Bit Version Info

Cluster Details

Is throttling from the destination part of the problem? Please note that occasional transient network connection errors are often caused by exceeding limits. For example, CW API can block/drop Fluent Bit connections when throttling is triggered.

We checked GetLogEvents and PutLogEvents, and we are well below the AWS limits.

  • EKS
  • EC2
  • DaemonSet

Application Details

Using Helm chart aws-for-fluent-bit 0.1.34

https://github.com/aws/eks-charts/tree/master/stable/aws-for-fluent-bit

Steps to reproduce issue

Related Issues
