Consider frequency of "/localnodes" fetches on idle clients #46

nyh · 2024-11-24T11:30:09Z

Currently it seems that all our load balancer implementations (I checked Python, Java and Go) update their list of live nodes (via the "/localnodes" call) once every second. Updating this list frequently is important, so that we don't keep sending requests to a dead node for a long time and so that we quickly discover nodes coming up. It's also a reasonably cheap request: if a client does 100 requests per second, then doing one more each second (and an especially cheap one like /localnodes) is negligible.
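To put a rough number on "negligible", here is a small back-of-the-envelope sketch. The function name `refresh_share` is made up for illustration, and it only counts requests; since /localnodes is cheaper than a typical request, its share of actual server cost would be lower still.

```python
def refresh_share(client_requests_per_second: float,
                  refreshes_per_second: float = 1.0) -> float:
    """Fraction of one client's outgoing requests that are /localnodes refreshes.

    Hypothetical helper for illustration only; it counts requests, not cost.
    """
    total = client_requests_per_second + refreshes_per_second
    if total == 0:
        return 0.0  # nothing is sent at all
    return refreshes_per_second / total
```

With 100 client requests per second and one refresh per second, the refresh is 1 request out of 101, just under 1% of the traffic from that client.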

However, there is one situation where doing a "/localnodes" request every second isn't negligible: when we have a lot of idle client processes. Perhaps it's worthwhile to recognize this case and not do a "/localnodes" request every second from a client library that knows it is idle. We could lower the frequency in this case, say from once per second to once per 60 seconds, but this has the obvious downside of continuing to send requests to a dead node for up to a whole minute. So perhaps we can consider a different approach: do "/localnodes" requests rarely (e.g., once an hour), and also do another /localnodes request right after executing the first client request in a particular second. The rationale behind this proposal is:

Executing a /localnodes request before a client request would increase that request's latency, which is undesirable.
If our list of nodes is outdated, the user request may fail and the client will retry the request (the AWS driver does this automatically). Because the load balancer does a /localnodes request after the client request, the retry (if not done too quickly) will get an up-to-date node list.
If we were to never run /localnodes and used a week-old node list, it is theoretically possible that all of the nodes have changed since, so we wouldn't know any live node and wouldn't be able to refresh our list. This is why it makes sense to refresh the list infrequently (e.g., once an hour) even when idle: to notice when the cluster is undergoing major changes, and to avoid a situation where we haven't run /localnodes for a week.
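A minimal sketch of how this policy could look, in Python. Everything here is an assumption for illustration: the class name `LazyNodeRefresher`, its methods, and the injected `fetch_nodes` callable (standing in for the actual /localnodes call) are hypothetical and not taken from any of the existing load balancer implementations. The sketch is also not thread-safe; a real client would need to guard the state with a lock and probably run the refresh in the background.

```python
import time
from typing import Callable, List


class LazyNodeRefresher:
    """Decides when to re-fetch the node list (hypothetical, illustration only)."""

    def __init__(
        self,
        fetch_nodes: Callable[[], List[str]],  # stand-in for the /localnodes call
        initial_nodes: List[str],
        min_interval: float = 1.0,      # at most one request-triggered refresh per second
        idle_interval: float = 3600.0,  # refresh once an hour even when idle
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_nodes = fetch_nodes
        self._nodes = list(initial_nodes)
        self._min_interval = min_interval
        self._idle_interval = idle_interval
        self._clock = clock
        # Assume the initial list was fetched just now.
        self._last_attempt = clock()

    @property
    def nodes(self) -> List[str]:
        return list(self._nodes)

    def _refresh(self) -> None:
        # Record the attempt first, so a failing fetch is not retried
        # on every single request within the same interval.
        self._last_attempt = self._clock()
        try:
            nodes = self._fetch_nodes()
        except Exception:
            return  # keep the old list; a later call will try again
        if nodes:
            self._nodes = list(nodes)

    def after_request(self) -> bool:
        """Call right after a client request; refreshes at most once per min_interval."""
        if self._clock() - self._last_attempt >= self._min_interval:
            self._refresh()
            return True
        return False

    def on_idle_tick(self) -> bool:
        """Call from a periodic timer; refreshes only after idle_interval without one."""
        if self._clock() - self._last_attempt >= self._idle_interval:
            self._refresh()
            return True
        return False
```

The idea is that the client calls `after_request()` once each request has completed, so the request itself never waits for /localnodes, and a slow timer calls `on_idle_tick()`. An idle client then sends roughly one /localnodes request per hour instead of 3600.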

nyh added the "enhancement" label (New feature or request) on Nov 24, 2024

dkropachev mentioned this issue Nov 27, 2024

Add rack/dc aware load balancing #40

Closed
