derekwin
Contributor

What type of PR is this?
/kind enhancement

What this PR does / why we need it:
Add a proposal for Locality LB (locality load balancing).

@kmesh-bot kmesh-bot added the kind/enhancement New feature or request label Jul 15, 2024
@kmesh-bot
Collaborator

Welcome @derekwin! It looks like this is your first PR to kmesh-net/kmesh 🎉

@LiZhenCheng9527
Contributor

Would you like to share your issue at Thursday's community meeting?


### Motivation

Currently, kmesh does not support locality topology-aware load balancing. Locality Load Balancing optimizes performance and reliability in distributed systems by directing traffic to the nearest service instances. This reduces latency, enhances availability, and lowers costs associated with cross-region data transfers. It also helps ensure compliance with data sovereignty regulations and improves overall user experience by providing faster and more reliable service responses.
Contributor

Suggested change
Currently, kmesh does not support locality topology-aware load balancing. Locality Load Balancing optimizes performance and reliability in distributed systems by directing traffic to the nearest service instances. This reduces latency, enhances availability, and lowers costs associated with cross-region data transfers. It also helps ensure compliance with data sovereignty regulations and improves overall user experience by providing faster and more reliable service responses.
Currently, Kmesh does not support locality topology-aware load balancing. Locality Load Balancing optimizes performance and reliability in distributed systems by directing traffic to the nearest service instances. This reduces latency, enhances availability, and lowers costs associated with cross-region data transfers. It also helps ensure compliance with data sovereignty regulations and improves overall user experience by providing faster and more reliable service responses.

Capitalize the first letter of Kmesh consistently.


#### case 1. locality failover
1. Destination Rule
Same as Istion. Parse rules specify configuration for Locality load balancing. (todo: outlier detection settings to detect and evict unhealthy hosts from the load balancing pool.)
Contributor

What is "Istion"? Do you mean Istio?

@hzxuzhonghu
Member

/ok-to-test


codecov bot commented Jul 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 52.80%. Comparing base (433592b) to head (b2caa7c).
Report is 217 commits behind head on main.

see 29 files with indirect coverage changes




Currently, kmesh does not support locality topology-aware load balancing. Locality Load Balancing optimizes performance and reliability in distributed systems by directing traffic to the nearest service instances. This reduces latency, enhances availability, and lowers costs associated with cross-region data transfers. It also helps ensure compliance with data sovereignty regulations and improves overall user experience by providing faster and more reliable service responses.

#### Goals
Member

Suggested change
#### Goals
### Goals


1. prioritize add locality load balancing capabilities in the workload mode.

2. two types of locality load balancing : locality failover, locality weighted distribution.
Member

I am not sure how locality weighted distribution can be implemented in workload mode. The workload API does not actually support weights.

#### case 1. locality failover
1. Destination Rule
Same as Istion. Parse rules specify configuration for Locality load balancing. (todo: outlier detection settings to detect and evict unhealthy hosts from the load balancing pool.)
- Outlier detection should occur before load balancing.
Member

This does not suit workload mode, because the workload API does not include outlier settings. It does LB based on where the endpoint resides.

@derekwin
Contributor Author

> Would you like to share your issue at Thursday's community meeting?

yes

@kmesh-bot kmesh-bot added size/L and removed size/M labels Jul 25, 2024
@derekwin
Contributor Author

I have updated the proposal.

@derekwin
Contributor Author

I propose a new implementation of the location matching algorithm that avoids circular computations and also reduces the amount of data that must be stored in BPF maps. Details: https://github.com/derekwin/treemap/tree/master
Suggestions for further improving the approach are welcome.

@Okabe-Rintarou-0
Member

Okabe-Rintarou-0 commented Jul 30, 2024

If there are no conflicts, there is no need to merge the main branch.
If there are conflicts, you can keep the commit history clean by rebasing instead:

git rebase main

Then resolve the conflicts and run:

git rebase --continue
git push --force

The DCO GitHub Action failed because it requires every commit to carry your sign-off, which you can add with the -s flag:

git commit -s -m 'something to say'

Member

@hzxuzhonghu hzxuzhonghu left a comment

I would like to see more API design in the proposal rather than function implementations.

How do you express the priority level, and how do you match the client locality with the endpoints?


1. prioritize add locality load balancing capabilities in the workload mode.

2. locality load balancing mode: locality failover.
Member

How about strict mode?

https://pkg.go.dev/istio.io/istio/pkg/workloadapi#LoadBalancing_Scope

2. calculate locality match rank
Member

Group endpoints by priority.


3. choose endpoint

Randomly select one endpoint from the group with the highest rank as the service backend.
Member

Suggested change
Randomly select one endpoint from the group with the highest rank as the service backend.
Randomly select one endpoint from the group with the highest priority

Also, please add more detail on what we do if all the endpoints in the highest-priority group are unhealthy.

Member

And for strict mode, how would you select the endpoint? I would like to see that.


4. maybe more? Panic threshold

When the proportion of healthy endpoints in the high-rank group falls below the panic threshold, select endpoints from the next rank group.
Member

I would not worry about this at first. First, respect the workload health status.

__u32 waypoint_addr;
__u32 waypoint_port;
// add health status: healthStatus
// add locality information
Member

Please add a description of what these fields look like.

@derekwin
Contributor Author

> Yes, we need a user guide for the locality load balancer; you should file it in the website repo. BTW, here is a reference for you: https://istio.io/latest/docs/tasks/traffic-management/locality-load-balancing/

ok.

@derekwin
Contributor Author

I have updated the content of this proposal, and the documentation in the website repo will be updated soon.

@derekwin
Contributor Author

@hzxuzhonghu

Contributor

@LiZhenCheng9527 LiZhenCheng9527 left a comment

almost lgtm

7. For the random policy, all endpoints are marked with a priority of 0. For failover or strict policy, the priority is set to 0 for the endpoint with the highest match according to the `routingPreference`.
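
The priority rule in item 7 can be illustrated with a short Go sketch. It assumes that `routingPreference` is an ordered list of scopes, that scopes are compared in order until the first mismatch, and that an endpoint matching every scope gets priority 0. The `Scope`, `Locality`, `Policy`, and `Priority` names are hypothetical and are not taken from the Kmesh code base or the Istio workload API.

```go
package main

import "fmt"

// Scope names one level of the locality hierarchy (hypothetical type).
type Scope int

const (
	Region Scope = iota
	Zone
	Subzone
)

// Locality is a hypothetical region/zone/subzone triple.
type Locality struct {
	Region, Zone, Subzone string
}

// Policy is a hypothetical stand-in for the load balancing policy.
type Policy int

const (
	Random Policy = iota
	Failover
	Strict
)

// matches reports whether two localities agree on a single scope.
func matches(a, b Locality, s Scope) bool {
	switch s {
	case Region:
		return a.Region == b.Region
	case Zone:
		return a.Zone == b.Zone
	case Subzone:
		return a.Subzone == b.Subzone
	}
	return false
}

// Priority returns 0 for the best match; larger values are worse matches.
// Scopes are checked in order and counting stops at the first mismatch.
func Priority(p Policy, client, ep Locality, prefs []Scope) int {
	if p == Random {
		return 0 // random policy: every endpoint is equally preferred
	}
	matched := 0
	for _, s := range prefs {
		if !matches(client, ep, s) {
			break
		}
		matched++
	}
	return len(prefs) - matched
}

func main() {
	client := Locality{"r1", "z1", "s1"}
	prefs := []Scope{Region, Zone}
	fmt.Println(Priority(Failover, client, Locality{"r1", "z1", "s2"}, prefs)) // 0
	fmt.Println(Priority(Failover, client, Locality{"r1", "z2", "s1"}, prefs)) // 1
	fmt.Println(Priority(Failover, client, Locality{"r2", "z1", "s1"}, prefs)) // 2
	fmt.Println(Priority(Random, client, Locality{"r2", "z2", "s2"}, prefs))   // 0
}
```

Under these assumptions the priority is simply the number of preferred scopes left unmatched, so it stays small and bounded by the length of `routingPreference`.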

#### control flow
![locality_lb_pic](pics/locality_lb.svg)
Contributor

Please center the image.

@LiZhenCheng9527
Contributor

/lgtm

derekwin and others added 9 commits November 27, 2024 16:29
Signed-off-by: seclee <[email protected]>
Signed-off-by: derekwin <[email protected]>
Signed-off-by: derekwin <[email protected]>
Signed-off-by: derekwin <[email protected]>
Co-authored-by: lizhencheng <[email protected]>
Signed-off-by: derekwin <[email protected]>
Signed-off-by: derekwin <[email protected]>
@LiZhenCheng9527
Contributor

/lgtm

@kmesh-bot kmesh-bot added the lgtm label Nov 27, 2024
@LiZhenCheng9527
Contributor

@hzxuzhonghu Can this PR be merged?

Member

@hzxuzhonghu hzxuzhonghu left a comment

/lgtm

@kmesh-bot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu


@kmesh-bot kmesh-bot merged commit 1727d3c into kmesh-net:main Dec 13, 2024
3 checks passed