Observing "Group Authorization Failed" error for every 12 hours #1093
Comments
Hi @gopi-bathala, are you still experiencing this issue? If so, could you please include the version of the aws_msk_iam module you're using? Thanks!
Hi @rhansen2, thanks for your reply. Yes, we are still observing the rebalance every 12 hours. Below is the aws_msk_iam module version that is indirectly referenced in our go.mod file: github.com/segmentio/kafka-go/sasl/aws_msk_iam_v2 v0.0.0-20230127181734-172fe7593625
I initially thought you might be experiencing #976, but it seems you're using a version that contains that fix. Do you need to restart your consumers when you encounter this error, or do things self-correct? Is your IAM session TTL set to 12 hours? It's possible that when the credentials the connection first used expire, the connection can no longer be used, which causes heartbeats to fail and the group to rebalance.
The consumers are automatically restarted after that heartbeat error. I will check the IAM TTL and get back to you. That could be the reason for these timeouts.
We experience this same issue, though our TTL is 1h, so we get a spew of errors every hour. After failing, it refreshes the token and then retries the writes, which go through fine, but since these all go to the error logger, it clutters things up.
We should implement KIP-368 to solve this. The way other clients appear to handle this (franz-go example) is to check the expiration during requests and re-authenticate if the expiration time is within some threshold.
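To make the threshold idea concrete, here is a minimal, self-contained sketch in Go. It is not kafka-go's API and not how franz-go implements it; `session`, `authFunc`, `beforeRequest`, and `simulate` are hypothetical names, and the handshake is a stub that simply returns a new expiry time. It assumes a 1h session lifetime (as in the comment above), a 15-minute threshold, and one request every 10 minutes.

```go
package main

import (
	"fmt"
	"time"
)

// authFunc is a hypothetical stand-in for a SASL handshake. It returns
// the time at which the newly established session expires.
type authFunc func(now time.Time) (expiry time.Time, err error)

// session tracks when the credentials used on one connection expire.
type session struct {
	expiry    time.Time     // zero value means "never authenticated"
	threshold time.Duration // re-authenticate this long before expiry
	auth      authFunc
}

// needsReauth reports whether the session has expired or will expire
// within the threshold.
func (s *session) needsReauth(now time.Time) bool {
	return !now.Before(s.expiry.Add(-s.threshold))
}

// beforeRequest re-authenticates if needed. Call it before each request.
func (s *session) beforeRequest(now time.Time) error {
	if !s.needsReauth(now) {
		return nil
	}
	expiry, err := s.auth(now)
	if err != nil {
		return fmt.Errorf("re-authentication failed: %w", err)
	}
	s.expiry = expiry
	return nil
}

// simulate issues one request every 10 minutes for 3 hours against a
// 1-hour session lifetime and returns how many handshakes were made.
func simulate() int {
	calls := 0
	auth := func(now time.Time) (time.Time, error) {
		calls++
		return now.Add(time.Hour), nil // assumed 1h credential lifetime
	}
	start := time.Date(2023, 1, 1, 0, 0, 0, 0, time.UTC)
	s := &session{threshold: 15 * time.Minute, auth: auth}
	for t := start; t.Before(start.Add(3 * time.Hour)); t = t.Add(10 * time.Minute) {
		if err := s.beforeRequest(t); err != nil {
			panic(err)
		}
	}
	return calls
}

func main() {
	fmt.Println("handshakes:", simulate())
}
```

In this sketch the threshold is deliberately larger than the gap between requests, so a request never goes out on a session that has already expired. With a threshold shorter than the request interval, the check could miss the window and the first request after expiry would still fail.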
Describe the bug
Hi, we are using the kafka-go library in our application and we have observed a strange rebalance of the consumer group that occurs every 12 hours. Below is the error message we observed. Could you please look into this issue? Is it related to an incorrect configuration?
Errors:
Kafka Version
The CloudWatch logs are attached for your analysis.