Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create additional NEGs based on subnets in Node Topology CR. #2702

Open
wants to merge 11 commits into
base: master
Choose a base branch
from

Conversation

sawsa307
Copy link
Contributor

  • Query Node Topology CR for the current set of NEGs in the cluster.
  • When ensureNetworkEndpointGroups(), ensure NEGs are properly provisioned in the non-default subnets as well.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Oct 18, 2024
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Oct 18, 2024
@sawsa307 sawsa307 marked this pull request as draft October 18, 2024 20:18
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 18, 2024
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Oct 22, 2024
@sawsa307 sawsa307 marked this pull request as ready for review October 22, 2024 15:53
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 22, 2024
@sawsa307
Copy link
Contributor Author

/assign @gauravkghildiyal

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sawsa307
Once this PR has been reviewed and has the lgtm label, please ask for approval from gauravkghildiyal. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sawsa307 sawsa307 force-pushed the create-non-default-neg branch 2 times, most recently from 644e229 to 264bd6c Compare October 23, 2024 20:44
pkg/neg/types/types_test.go Outdated Show resolved Hide resolved
pkg/utils/zonegetter/zone_getter.go Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/neg/syncers/utils.go Outdated Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
@sawsa307
Copy link
Contributor Author

/label tide/merge-method-squash

@k8s-ci-robot k8s-ci-robot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Oct 31, 2024
Copy link
Contributor Author

@sawsa307 sawsa307 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just addressed all the comments, @gauravkghildiyal could you take another look when you have time? Thank you!

pkg/neg/types/types_test.go Outdated Show resolved Hide resolved
pkg/utils/zonegetter/zone_getter.go Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/neg/types/types_test.go Outdated Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/utils/zonegetter/zone_getter.go Outdated Show resolved Hide resolved
@gauravkghildiyal
Copy link
Member

Also, is this TODO something that we plan on fixing independently? I guess as long as the MSC flags are disabled, this shouldn't cause any problems but please do confirm. Thanks!

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 5, 2024
@sawsa307
Copy link
Contributor Author

sawsa307 commented Nov 5, 2024

Also, is this TODO something that we plan on fixing independently? I guess as long as the MSC flags are disabled, this shouldn't cause any problems but please do confirm. Thanks!

Yes I'm planning to tackle those TODO in follow up PRs(like #2728), but if you would like, I can also include some of them in this PR as well.

@sawsa307 sawsa307 force-pushed the create-non-default-neg branch 2 times, most recently from 4bc66aa to 6d44fe7 Compare November 7, 2024 03:50
pkg/utils/zonegetter/zone_getter.go Outdated Show resolved Hide resolved
nodeTopologyNotSynced := z.nodeTopologyInformer != nil && !z.nodeTopologyHasSynced()
if z.onlyIncludeDefaultSubnetNodes || z.nodeTopologyInformer == nil || nodeTopologyNotSynced {
if z.nodeTopologyInformer == nil || nodeTopologyNotSynced {
logger.Info("NodeTopologyInformer is not ready, fall back to use default subnet only")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend

logger.Info("Falling back to only using default subnet", "z.onlyIncludeDefaultSubnetNodes", z.onlyIncludeDefaultSubnetNodes, <Rest of the variables influencing this decision>)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be useful to know what was the reason why we are using "default subnet only", which means it's worth logging ALL the variables which went into that decision. So maybe we don't have an if !z.nodeTopologyHasSynced() { condition for the logs, and instead log:

logger.Info("Falling back to only using default subnet when listing nodes", "z.onlyIncludeDefaultSubnetNodes", z.onlyIncludeDefaultSubnetNodes, "z.nodeTopologyHasSynced()", nodeTopologyHasSynced)

On a related note, it plan on using or logging z.nodeTopologyHasSynced(), it may be useful to store it in a variable before. Reason being that since it's a function, it may return a different value between two independent invocations -- if that ever happens for some reason, our logs would simply end up confusing us.

* Remove NodeTopologyInformer from controller context hasSynced so it
  won't block controllers from starting up in case of the CRD or CR does
  not exist.
* Query Node Topology CR for the current set of NEGs in the cluster.
* When ensureNetworkEndpointGroups(), ensure NEGs are properly
  provisioned in the non-default subnets as well.
pkg/utils/zonegetter/zone_getter.go Outdated Show resolved Hide resolved
pkg/utils/zonegetter/zone_getter.go Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
Comment on lines +137 to +138
namer negtypes.NetworkEndpointGroupNamer
l4Namer namer.L4ResourcesNamer
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need both. the namer should be L4 / L7 agnostic. The right namer should be set on transactionSyncer creation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need both since they have different interface and naming schema.
For L4 NEGs, the NEG name is determined by L4Backend(), while NEG() is only for L7 NEGs. As a result, I also separated the naming schema for L4 and L7 non-default subnet NEGs as L4NonDefaultSubnetNEG and NonDefaultSubnetNEG.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly, because of this complication of having two separate L4 and L7 namers, even the existing code has a bug at:

!manager.namer.IsNEG(portInfo.NegName),

where this value determines the value of customName bool. The way our new logic is, this should lead to customName always being true for L4 NEGs and thus it would end up using the L7 custom namer for L4 NEGs.

Do we want to make L4 and L7 namers implement the following?

	NonDefaultSubnetNEG(namespace, name, subnetName string, port int32) string

	NonDefaultSubnetCustomNEG(customNEGName, subnetName string) (string, error)
  • The port value will not be used for L4
  • Because we will set customNEG for L4 to always be false, the L4 NonDefaultSubnetCustomNEG should also never be used.

pkg/neg/syncers/transaction.go Outdated Show resolved Hide resolved
@sawsa307
Copy link
Contributor Author

@gauravkghildiyal @swetharepakula This PR is ready for another round of review, please take a look, thank you!

@gauravkghildiyal
Copy link
Member

@sawsa307
Copy link
Contributor Author

Github is just the worst at showing newly added comments so sharing the links:

Addressed both comments, please take a look! @gauravkghildiyal

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants