Fluent Bit 400 Bad Request when integrating with OpenSearch on EKS cluster #810

Open
CSi-CJ opened this issue Apr 24, 2024 · 3 comments

CSi-CJ commented Apr 24, 2024

Bug Report

Describe the bug
I deployed Fluent Bit in an EKS cluster to send logs to AWS OpenSearch and view them in Kibana across regions, but Fluent Bit always reports an HTTP 400 error: "400 Bad request. Your browser sent an invalid request."

To Reproduce

  • Rubular link if applicable:
  • Example log message if applicable:
[2024/04/23 09:22:48] [ info] [input:storage_backlog:storage_backlog.2] queueing tail.0:1-1713791399.265102967.flb
[2024/04/23 09:22:48] [ info] [input:storage_backlog:storage_backlog.2] queueing tail.0:1-1713791418.556741603.flb
[2024/04/23 09:22:48] [ info] [input:storage_backlog:storage_backlog.2] queueing tail.0:1-1713791458.732401079.flb
[2024/04/23 09:22:48] [ warn] [engine] failed to flush chunk '1-1713784556.84415628.flb', retry in 10 seconds: task_id=373, input=storage_backlog.2 > output=opensearch.0 (out_id=0)
[2024/04/23 09:22:48] [ warn] [engine] failed to flush chunk '1-1713784595.472895976.flb', retry in 10 seconds: task_id=725, input=storage_backlog.2 > output=opensearch.0 (out_id=0)
[2024/04/23 09:22:49] [ warn] [engine] chunk '1-1713780558.217934206.flb' cannot be retried: task_id=234, input=storage_backlog.2 > output=opensearch.0
[2024/04/23 09:22:49] [ warn] [engine] failed to flush chunk '1-1713786198.217737377.flb', retry in 14 seconds: task_id=49, input=storage_backlog.2 > output=opensearch.0 (out_id=0)
[2024/04/23 09:22:49] [ warn] [engine] chunk '1-1713784970.684976442.flb' cannot be retried: task_id=792, input=storage_backlog.2 > output=opensearch.0
[2024/04/23 09:22:49] [error] [output:opensearch:opensearch.0] HTTP status=400 URI=/_bulk, response:
<html><body><h1>400 Bad request</h1>
Your browser sent an invalid request.
</body></html>

[2024/04/23 09:22:49] [ warn] [engine] failed to flush chunk '1-1713786418.224288613.flb', retry in 16 seconds: task_id=908, input=storage_backlog.2 > output=opensearch.0 (out_id=0)
[2024/04/23 09:22:49] [ warn] [engine] failed to flush chunk '1-1713785331.83017891.flb', retry in 25 seconds: task_id=841, input=storage_backlog.2 > output=opensearch.0 (out_id=0)
[2024/04/23 09:22:49] [ warn] [engine] chunk '1-1713780668.223932764.flb' cannot be retried: task_id=25, input=storage_backlog.2 > output=opensearch.0
[2024/04/23 09:22:49] [error] [output:opensearch:opensearch.0] HTTP status=400 URI=/_bulk, response:
<html><body><h1>400 Bad request</h1>
Your browser sent an invalid request.
</body></html>

[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bed30 75 in the next 5 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bf1b8 642 in the next 4 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bb400 35 in the next 7 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87c6c60 291 in the next 11 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87c73b8 962 in the next 25 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87c3da8 260 in the next 6 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87ca400 211 in the next 5 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bfe10 298 in the next 5 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87cb328 967 in the next 20 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87d22b8 839 in the next 8 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87c0040 121 in the next 12 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87ca630 1058 in the next 9 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87bdea8 581 in the next 10 seconds
[2024/04/23 09:22:49] [ info] [task] re-schedule retry=0x7f9da87d0af8 1059 in the next 7 seconds
  • Steps to reproduce the problem:
  • Prepare two AWS accounts (optional).
  • Deploy Fluent Bit using the configuration below.
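
To make a log excerpt like the one above easier to read, the lines can be tallied by level and failure type. The following is a minimal sketch, not part of Fluent Bit: the helper name `summarize` and the counter keys are hypothetical, and the regular expression assumes the `[date time] [level] message` prefix seen in the excerpt.

```python
import re
from collections import Counter

# Matches the log prefix seen above: [YYYY/MM/DD HH:MM:SS] [level] message
LINE_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\] \[\s*(\w+)\] (.*)$"
)

def summarize(lines):
    """Count log levels, scheduled retries, non-retryable chunks, and HTTP error statuses."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            # Continuation lines (for example the HTML response body) are skipped.
            continue
        _, level, msg = m.groups()
        counts[f"level:{level}"] += 1
        if "failed to flush chunk" in msg:
            counts["retry_scheduled"] += 1
        elif "cannot be retried" in msg:
            counts["not_retried"] += 1
        # Case-sensitive on purpose: only the [error] line uses "HTTP status=".
        status = re.search(r"HTTP status=(\d{3})", msg)
        if status:
            counts[f"http_{status.group(1)}"] += 1
    return counts
```

Calling `summarize(open("fluent-bit.log"))` on a saved copy of the pod log (the file name is only an example) returns a `Counter` showing how many chunks were rescheduled versus not retried.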

Expected behavior
The collected logs should be processed without errors in the Fluent Bit pod, and the resulting log entries should be visible in Kibana.

Your Environment

  • Version used: public.ecr.aws/aws-observability/aws-for-fluent-bit:stable
  • Configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
  namespace: logging
  labels:
    k8s-app: fluent-bit
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush                     5
        Grace                     30
        Log_Level                 info
        Daemon                    off
        Parsers_File              parsers.conf
        HTTP_Server               ${HTTP_SERVER}
        HTTP_Listen               0.0.0.0
        HTTP_Port                 ${HTTP_PORT}
        storage.path              /var/fluent-bit/state/flb-storage/
        storage.sync              normal
        storage.checksum          off
        storage.backlog.mem_limit 5M
        
    @INCLUDE application-log.conf
  
  application-log.conf: |
    [INPUT]
        Name                tail
        Tag                 application.*
        Exclude_Path        /var/log/containers/cloudwatch-agent*, /var/log/containers/fluent-bit*, /var/log/containers/aws-node*, /var/log/containers/kube-proxy*
        Path                /var/log/containers/*.log
        multiline.parser    docker, cri
        DB                  /var/fluent-bit/state/flb_container.db
        Mem_Buf_Limit       50MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Rotate_Wait         30
        storage.type        filesystem
        Read_from_Head      ${READ_FROM_HEAD}

    [INPUT]
        Name                tail
        Tag                 application.*
        Path                /var/log/containers/fluent-bit*
        multiline.parser    docker, cri
        DB                  /var/fluent-bit/state/flb_log.db
        Mem_Buf_Limit       5MB
        Skip_Long_Lines     On
        Refresh_Interval    10
        Read_from_Head      ${READ_FROM_HEAD}

    [FILTER]
        Name                kubernetes
        Match               application.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_Tag_Prefix     application.var.log.containers.
        Merge_Log           On
        Merge_Log_Key       log_processed
        K8S-Logging.Parser  On
        K8S-Logging.Exclude Off
        Labels              Off
        Annotations         Off
        Use_Kubelet         On
        Kubelet_Port        10250
        Buffer_Size         0

    [OUTPUT]
        Name opensearch
        Match application.*
        Host vpc-xxxxx.us-west-2.es.amazonaws.com
        Port 443
        Logstash_Format On
        Logstash_Prefix kube
        Logstash_DateFormat %Y.%m.%d.%H
        Retry_Limit False
        tls On
        AWS_Auth On
        AWS_Region ${AWS_REGION}
        Suppress_Type_Name On
        Type  _doc
        Trace_Error       On
        Replace_Dots      On

  parsers.conf: |
    [PARSER]
        Name                syslog
        Format              regex
        Regex               ^(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$
        Time_Key            time
        Time_Format         %b %d %H:%M:%S

    [PARSER]
        Name                container_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\S(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ

    [PARSER]
        Name                cwagent_firstline
        Format              regex
        Regex               (?<log>(?<="log":")\d{4}[\/-]\d{1,2}[\/-]\d{1,2}[ T]\d{2}:\d{2}:\d{2}(?!\.).*?)(?<!\\)".*(?<stream>(?<="stream":").*?)".*(?<time>\d{4}-\d{1,2}-\d{1,2}T\d{2}:\d{2}:\d{2}\.\w*).*(?=})
        Time_Key            time
        Time_Format         %Y-%m-%dT%H:%M:%S.%LZ
  • Environment name and version (e.g. Kubernetes? What version?): kubernetes:1.25
  • Server type and version: eks.17
  • Operating System and version: AMI AL2_x86_64
  • Filters and plugins:
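
For reference, the `Logstash_Prefix` and `Logstash_DateFormat` settings in the `[OUTPUT]` section above determine the index name each record is written to. The sketch below shows how that name would be derived. It assumes the prefix and date are joined with `-` and that the date is formatted in UTC. The helper `logstash_index` is hypothetical, not a Fluent Bit API.

```python
from datetime import datetime, timezone

def logstash_index(prefix, date_format, ts, separator="-"):
    """Build a Logstash-style index name: <prefix><separator><formatted date>."""
    return f"{prefix}{separator}{ts.strftime(date_format)}"

# Timestamp taken from the example log above, assumed here to be UTC.
ts = datetime(2024, 4, 23, 9, 22, 49, tzinfo=timezone.utc)
print(logstash_index("kube", "%Y.%m.%d.%H", ts))  # kube-2024.04.23.09
```

Because the date format ends in `%H`, this configuration targets a new index every hour, which means up to 24 indices per day for the `kube` prefix.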

Additional context
I hope someone can help me integrate EKS with OpenSearch correctly.

CSi-CJ commented May 24, 2024

@PettitWesley Can you give me some advice to help me solve this problem?

I tried a minimal sample configuration to test communication between Fluent Bit and OpenSearch, but still got the same error.

[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    debug
[INPUT]
    name tail
    path /fluent-bit/test.log
[OUTPUT]
    name stdout
    match *
[OUTPUT]
    Name            es
    Match           *
    Host            ${OPENSEARCH_HOST}
    Port            443
    Logstash_Format On
    Logstash_Prefix test
    tls.verify      Off
    Index           test-index
    Type            _doc
    tls             On
    AWS_Auth        On
    AWS_Region      ${AWS_REGION}

Error log:

[2024/05/24 05:34:58] [debug] [upstream] KA connection #49 to aos-xxxxxxxxxx-qut34njbolfjttfezukhresuwu.us-west-2.es.amazonaws.com:443 is now available
[2024/05/24 05:34:58] [debug] [out flush] cb_destroy coro_id=32
[2024/05/24 05:34:58] [debug] [retry] new retry created for task_id=0 attempts=1
[2024/05/24 05:34:58] [ warn] [engine] failed to flush chunk '697-1716528898.139281660.flb', retry in 11 seconds: task_id=0, input=tail.0 > output=opensearch.1 (out_id=1)
[2024/05/24 05:35:08] [debug] [input:tail:tail.0] inode=9444188, /fluent-bit/test.log, events: IN_MODIFY
[2024/05/24 05:35:08] [debug] [input chunk] update output instances with new chunk size diff=58, records=1, input=tail.0
[2024/05/24 05:35:08] [debug] [task] created task=0x7fc61fda0ca0 id=1 OK
[2024/05/24 05:35:08] [debug] [output:stdout:stdout.0] task_id=1 assigned to thread #0
[0] tail.0: [1716528908.140517765, {"log"=>"{"name": "hello", "test": "hello world"}"}]
[2024/05/24 05:35:08] [debug] [upstream] KA connection #48 to aos-xxxxxxxxxx-qut34njbolfjttfezukhresuwu.us-west-2.es.amazonaws.com:443 has been assigned (recycled)
[2024/05/24 05:35:08] [debug] [http_client] not using http_proxy for header
[2024/05/24 05:35:08] [debug] [output:opensearch:opensearch.1] Signing request with AWS Sigv4
[2024/05/24 05:35:08] [debug] [out flush] cb_destroy coro_id=17
[2024/05/24 05:35:08] [debug] [aws_credentials] Requesting credentials from the env provider..
[2024/05/24 05:35:08] [debug] [output:opensearch:opensearch.1] HTTP Status=400 URI=/_bulk
[2024/05/24 05:35:08] [error] [output:opensearch:opensearch.1] HTTP status=400 URI=/_bulk, response:
<html><body><h1>400 Bad request</h1>
Your browser sent an invalid request.
</body></html>
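
One detail in the response may help narrow this down: the body is an HTML page, whereas OpenSearch normally reports a rejected bulk request as a JSON document. This may mean the 400 was produced by something in front of the cluster rather than by the bulk handler itself. That is an assumption, not a confirmed diagnosis. The sketch below shows one way to tell the two cases apart when inspecting captured responses. The function name `classify_body` is hypothetical.

```python
import json

def classify_body(body):
    """Return 'json', 'html', or 'other' for a captured HTTP response body."""
    text = body.strip()
    try:
        doc = json.loads(text)
    except ValueError:
        doc = None  # not valid JSON (this includes an empty body)
    if isinstance(doc, dict):
        return "json"
    if text.lower().startswith(("<html", "<!doctype html")):
        return "html"
    return "other"
```

Applied to the response in the log above, this returns `"html"`. A JSON error object from the bulk API would return `"json"`.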

@vaibhavops

Same issue here; kindly help.

sanjinp commented Nov 4, 2024

+1 same issue
