
kmesh restart with config change #640

Merged · 4 commits merged into kmesh-net:main · Aug 5, 2024
Conversation

@lec-bit (Contributor) commented Jul 26, 2024

What type of PR is this?
/kind feature

What this PR does / why we need it:

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:


codecov bot commented Jul 27, 2024

Codecov Report

Attention: Patch coverage is 55.10204% with 44 lines in your changes missing coverage. Please review.

Project coverage is 50.97%. Comparing base (4a7807b) to head (2d1dc68).
Report is 51 commits behind head on main.

| Files | Patch % | Lines |
| --- | --- | --- |
| pkg/controller/workload/workload_processor.go | 57.44% | 29 Missing and 11 partials ⚠️ |
| pkg/controller/workload/cache/service_cache.go | 0.00% | 4 Missing ⚠️ |

| Files | Coverage Δ |
| --- | --- |
| pkg/controller/workload/cache/service_cache.go | 0.00% <0.00%> (ø) |
| pkg/controller/workload/workload_processor.go | 58.15% <57.44%> (+14.06%) ⬆️ |

... and 12 files with indirect coverage changes


Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cd60b90...2d1dc68.

return nil
}

func (p *Processor) removeWorkloadResourceByUid(uid string) error {
Member:

Suggest renaming to removeWorkloadFromBpfMap, to distinguish it from the removal from the local cache.

@lec-bit (Contributor, Author), Jul 29, 2024:

In fact, the same interface is called by the remove-from-local-cache path; I split it out and reused it.
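
A minimal sketch of the reuse pattern described here, with illustrative names (`bpfRecords`, `removeWorkloadResource`) that are assumptions rather than the actual kmesh implementation: one helper owns the bpf-map cleanup, and both the normal delete path and the restart reconciliation call it.

```go
package main

import "fmt"

// Processor is a stand-in for the real workload processor; bpfRecords
// stands in for the backend/frontend bpf maps.
type Processor struct {
	bpfRecords map[uint32]string
}

// removeWorkloadFromBpfMap clears only the bpf-map state for a workload.
func (p *Processor) removeWorkloadFromBpfMap(uid uint32) error {
	delete(p.bpfRecords, uid)
	return nil
}

// removeWorkloadResource is the normal delete path: drop the local-cache
// entry first, then reuse the shared bpf-map helper.
func (p *Processor) removeWorkloadResource(uid uint32) error {
	// ... remove the entry from the local cache here ...
	return p.removeWorkloadFromBpfMap(uid)
}

func main() {
	p := &Processor{bpfRecords: map[uint32]string{42: "pod-a"}}
	_ = p.removeWorkloadResource(42)
	fmt.Println(len(p.bpfRecords)) // 0: the bpf-map record is gone
}
```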

}
backendUid := p.hashName.StrToNum(uid)
// for pod-to-pod access, pod info is stored in the frontend map; when a pod goes offline, we need to delete the related records
if err = p.deletePodFrontendData(backendUid); err != nil {
Member:

cc @nlgwcy is fixing this, but I think anyone can merge first.

bv = bpf.BackendValue{}
)

if kmeshbpf.GetStartType() == kmeshbpf.Normal {
Member:

Since we haven't supported upgrade yet, to be safer shall we compare with RESTART only?

Contributor (Author):

Yes, I will change it.
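
A small sketch of the stricter guard agreed on here, assuming a start-type enum along the lines of the kmeshbpf constants seen in the diff (the `Update` value is hypothetical): the reconciliation runs only on an explicit Restart, rather than on anything that is not Normal.

```go
package main

import "fmt"

type StartType int

const (
	Normal StartType = iota
	Restart
	Update // hypothetical: e.g. a future in-place upgrade
)

// shouldReload applies the stricter check: reconcile the bpf maps only
// on an explicit Restart, never on Update or any other start type,
// since upgrade is not supported yet.
func shouldReload(t StartType) bool {
	return t == Restart
}

func main() {
	for _, t := range []StartType{Normal, Restart, Update} {
		fmt.Println(t, shouldReload(t))
	}
}
```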

kmeshbpf.SetStartType(kmeshbpf.Normal)
for str, num := range p.hashName.strToNum {
if p.WorkloadCache.GetWorkloadByUid(str) == nil && p.ServiceCache.GetService(str) == nil {
log.Debugf("GetWorkloadByUid and GetService nil:%v", str)
Member:

We cannot assume all the strs are services or pods, can we?

Contributor (Author):

Yes, I was careless here. I should add a check against the bpf map.
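
Following that point, a hedged sketch of what such a check might look like, with plain Go maps standing in for the backend and service bpf maps (all type and field names are illustrative): classify a persisted key by looking it up in the maps instead of assuming it is a pod or a service.

```go
package main

import "fmt"

// bpfMaps stands in for the real backend and service bpf maps.
type bpfMaps struct {
	backend map[uint32]string
	service map[uint32]string
}

// classify reports which bpf map, if any, holds a record for the hashed
// key, instead of assuming every persisted key is a pod or a service.
func (m bpfMaps) classify(num uint32) string {
	if _, ok := m.backend[num]; ok {
		return "backend"
	}
	if _, ok := m.service[num]; ok {
		return "service"
	}
	return "unknown" // in neither map: skip rather than delete blindly
}

func main() {
	m := bpfMaps{
		backend: map[uint32]string{1: "pod-a"},
		service: map[uint32]string{2: "svc-b"},
	}
	for _, num := range []uint32{1, 2, 3} {
		fmt.Println(num, m.classify(num))
	}
}
```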

@hzxuzhonghu (Member):

I cannot see how you make compareWorkloadAndServiceWithHashName run only once. And please add some test cases covering different scenarios of changes during restart.

@hzxuzhonghu (Member):

And you did not handle service deletion.

@lec-bit (Contributor, Author) commented Jul 29, 2024:

> cannot see how you make compareWorkloadAndServiceWithHashName only run once? And please add some cases covering different scenarios on changes during restart

When restarting, Status = Restart, and it is set back to Normal after execution, which ensures that the function is executed only once.
The current test case has a pod and a service in hashName, which are then processed in the compareWorkloadAndServiceWithHashName function.
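
A runnable sketch of the run-once mechanism just described, with stand-in start-type plumbing (the real accessors live in kmeshbpf, so these names are assumptions): the pass is gated on Restart and flips the status to Normal afterwards, so a second call is a no-op.

```go
package main

import "fmt"

type StartType int

const (
	Normal StartType = iota
	Restart
)

var startType = Restart // set by the daemon at startup

func GetStartType() StartType  { return startType }
func SetStartType(t StartType) { startType = t }

// compareWorkloadAndServiceWithHashName sketches the run-once mechanism:
// the pass is gated on Restart and flips the status to Normal, so any
// later invocation returns immediately.
func compareWorkloadAndServiceWithHashName() {
	if GetStartType() != Restart {
		return
	}
	SetStartType(Normal)
	fmt.Println("reconciling hashName entries against caches...")
	// ... walk the persisted hashName entries and clean up stale bpf records ...
}

func main() {
	compareWorkloadAndServiceWithHashName() // runs once
	compareWorkloadAndServiceWithHashName() // no-op: status is already Normal
}
```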

@hzxuzhonghu (Member):

@lec-bit Address the comments and we can merge.

Signed-off-by: let-bit <[email protected]>
return
}

log.Infof("reload workload config from last start")
Member:

Suggested change:
- log.Infof("reload workload config from last start")
+ log.Infof("reload workload config from last epoch")

kmeshbpf.SetStartType(kmeshbpf.Normal)

// The record exists in the hashName file, exists in Backend or Service bpfmap,
// and does not exist in cache.
Member:

Confusing

Member:

I guess you mean: when the str exists in the hash file,

1. if it is not in the userspace cache, we should delete it from bpf
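
A hedged restatement of that rule in code, with plain Go maps standing in for the userspace caches and the persisted hash file (all type and field names are illustrative): entries in the hash file that neither the workload cache nor the service cache still knows about are the stale records to delete from bpf.

```go
package main

import "fmt"

// caches stands in for the userspace WorkloadCache and ServiceCache.
type caches struct {
	workloads map[string]bool
	services  map[string]bool
}

// staleEntries returns the hash-file keys that no cache still knows
// about; these are the records the reconciliation pass deletes from bpf.
func staleEntries(hashFile []string, c caches) []string {
	var stale []string
	for _, str := range hashFile {
		if !c.workloads[str] && !c.services[str] {
			stale = append(stale, str)
		}
	}
	return stale
}

func main() {
	c := caches{
		workloads: map[string]bool{"pod-a": true},
		services:  map[string]bool{"default/svc-b": true},
	}
	fmt.Println(staleEntries([]string{"pod-a", "default/svc-b", "pod-gone"}, c))
	// Output: [pod-gone]
}
```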

Signed-off-by: let-bit <[email protected]>
@hzxuzhonghu (Member):
/lgtm
/approve

Thanks, this is a great improvement.

@kmesh-bot (Collaborator):
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kmesh-bot merged commit 996f216 into kmesh-net:main on Aug 5, 2024
8 checks passed