```go
	targetName = addr.TargetRef.Name
}
ls := &lockers.Service{
	ID: fmt.Sprintf("%s-api", targetName),
```
I'm not sure if it's safe to potentially have two services with the same `ID`? This would happen in the event that a pod had multiple addresses assigned to it. Obviously the `Address` field would be different, but since we base the `ID` on the pod's hostname, it would match for every address on the pod.
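To illustrate the pattern I mean, here's a rough sketch (the package name, helper function, and stand-in `Service` type are just for illustration, not the real gnmic code):

```go
// Hypothetical sketch of the duplicate-ID scenario; not the actual gnmic implementation.
package sketch

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// Service stands in for the gnmic lockers.Service type quoted in the diff above.
type Service struct {
	ID      string
	Address string
}

// buildServices builds one Service per endpoint address. Every address that
// belongs to the same pod derives its ID from the same TargetRef.Name, so the
// IDs collide even though the Address fields differ.
func buildServices(ep *corev1.Endpoints) []*Service {
	var svcs []*Service
	for _, subset := range ep.Subsets {
		for _, addr := range subset.Addresses {
			targetName := addr.IP
			if addr.TargetRef != nil {
				targetName = addr.TargetRef.Name
			}
			svcs = append(svcs, &Service{
				ID:      fmt.Sprintf("%s-api", targetName), // identical for every address of the pod
				Address: addr.IP,
			})
		}
	}
	return svcs
}
```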
Thanks for doing this; I took the lazy approach when implementing the k8s locker just to get it to work.
Instead of using two API calls to get the service and then the associated pods, we can use just one to get the endpoints associated with the service we care about.

This change requires a different Role to be bound. There is no longer any need for the `services` or `pods` resources, but `endpoints` needs to be added. An entire role might now look like this:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: gnmic
rules:
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - get
  - list
  - update
  - create
  - patch
  - delete
  - watch
- apiGroups:
  - ''
  resources:
  - endpoints
  verbs:
  - get
  - list
  - watch
```
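For illustration only, a rough sketch of the single-call lookup, assuming an in-cluster client and placeholder namespace/service names (the actual change lives in the k8s locker code):

```go
// Hypothetical sketch: fetch the Endpoints object for the service directly,
// instead of getting the Service and then listing its Pods.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}

	// One request: the Endpoints object already lists the ready pod addresses
	// backing the service, so there is no separate Service get or Pod list.
	ep, err := clientset.CoreV1().Endpoints("default").Get(context.TODO(), "gnmic-svc", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	for _, subset := range ep.Subsets {
		for _, addr := range subset.Addresses {
			name := addr.IP
			if addr.TargetRef != nil {
				name = addr.TargetRef.Name
			}
			fmt.Println(name, addr.IP)
		}
	}
}
```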
Instead of fetching the object on a schedule, we can ask Kubernetes to send us changes as they happen. This is very similar to the Consul method. The first response we get will include the ResourceVersion of the current object, which we can use in subsequent requests to restart the stream of object updates from where we left off. This also changes the semantics of the watch timer to match the Consul locker. Since we no longer poll for updates, this timer becomes a timeout on the watch API call. This also adds a retry timer that is used in case of an API error, again the same as the Consul locker.
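A minimal sketch of that watch loop, assuming placeholder names and timer values (not the actual gnmic code):

```go
// Hypothetical sketch of the watch pattern described above.
package sketch

import (
	"context"
	"time"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

func watchEndpoints(ctx context.Context, clientset kubernetes.Interface, ns, svc string) {
	timeout := int64(60) // what used to be the poll interval becomes a timeout on the watch call
	rv := ""             // ResourceVersion of the last object we saw; lets us resume where we left off
	for {
		w, err := clientset.CoreV1().Endpoints(ns).Watch(ctx, metav1.ListOptions{
			FieldSelector:   "metadata.name=" + svc,
			ResourceVersion: rv,
			TimeoutSeconds:  &timeout,
		})
		if err != nil {
			time.Sleep(2 * time.Second) // retry timer on API errors, as with the Consul locker
			continue
		}
		for ev := range w.ResultChan() {
			ep, ok := ev.Object.(*corev1.Endpoints)
			if !ok {
				continue
			}
			rv = ep.ResourceVersion // resume point for the next watch call
			// ... translate ep.Subsets into the locker's service entries here ...
		}
	}
}
```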
I found these messages to be far too chatty and they don't appear to be helpful in healthy operation.
I just realized that these commits had the wrong email address, so I force-pushed to fix that. No other changes.
Apart from the comment I left, this looks good to me.
I think this should handle the
Thanks again for the contribution!
I was excited to see the Kubernetes clustering feature from #560 in 0.25.0. When testing, I noticed the frequent service update log messages and went exploring in the code to understand why it was so much chattier than my previous Consul deployment, and what I found became this PR.
There's a detailed explanation in each commit here (especially note the RBAC changes required by the first commit), but the upshot of the changes is that we can use Endpoints objects directly to avoid having to find the gNMIc pods ourselves. We get some extra benefits for free with Endpoints: non-ready pod addresses get removed from the cluster's discovered services without us having to do the work of finding them. We can then use watches on these objects instead of scraping them repeatedly, which gives us near-real-time detection of changes and -- ultimately what I was looking for -- a quiet log stream in the steady state, because we only send updates when things actually change.
I did not see any documentation for the K8s locker yet, but if I just missed it, that will need to be updated too.
This is my first time both with the Kubernetes API and with any nontrivial amount of Go code, so don't be shy about being critical if I've done something wrong.