Observing "Group Authorization Failed" error for every 12 hours #1093

gopi-bathala · 2023-03-09T07:22:19Z

Describe the bug

Hi, we are using the kafka-go library in our application and we observe a strange rebalance of the consumer group every 12 hours. The error messages we see are listed below. Could you please look into this? Could it be caused by an incorrect configuration?

Errors:

"Group Authorization Failed: the client is not authorized to access a particular group id"
EOF
Use of closed network connection

Kafka Version

Kafka version 3.2.0
Kafka-go version 0.4.38

The CloudWatch logs are attached for analysis.

rhansen2 · 2023-05-19T16:02:39Z

Hi @gopi-bathala, are you still experiencing this issue?

If so could you please include the version of the aws_msk_iam module you're using?

Thanks!

gopi-bathala · 2023-05-19T20:56:01Z

Hi @rhansen2 ,

Thanks for your reply

Yes, we are still seeing the rebalance every 12 hours. Below is the aws_msk_iam module version, which is referenced as an indirect dependency in our go.mod file:

github.com/segmentio/kafka-go/sasl/aws_msk_iam_v2 v0.0.0-20230127181734-172fe7593625

rhansen2 · 2023-05-19T21:35:35Z

I initially thought you might be experiencing #976, but it seems you're using a version that contains that fix. Do you need to restart your consumers when you encounter this error, or do things self-correct?

Is your IAM session TTL set to 12 hours? It's possible that when the credentials the connection first authenticated with expire, the connection can no longer be used, which causes heartbeats to fail and the group to rebalance.

gopi-bathala · 2023-05-23T13:45:45Z

Consumers are restarted automatically after that heartbeat error. I will check the IAM TTL and get back to you. That could be the reason for these timeouts.

jcarter3 · 2023-07-07T16:44:52Z

We experience the same issue, though our TTL is 1h, so we get a spew of errors every hour. After failing, the client refreshes the token and retries the writes, which then go through fine. But since all of these go to the error logger, it's cluttering things up.

petedannemann · 2023-08-29T13:15:36Z

We should implement KIP-368 to solve this. The way other clients appear to handle this (franz-go example) is to check the session expiration when making requests and re-authenticate if the expiration time is within some threshold.

gopi-bathala added the bug label Mar 9, 2023

rhansen2 self-assigned this May 12, 2023

petedannemann mentioned this issue Aug 29, 2023

SASL Authentication Failed Error #1183

Open

scott-the-programmer mentioned this issue Nov 10, 2023

Implementation of session timeout in V1 SASL Authentication #1227

Closed

ertanden linked a pull request Nov 10, 2023 that will close this issue

feat: sasl reauthentication #1230

Open
