-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support IP configuration for multicard instances #2031
base: main
Are you sure you want to change the base?
Conversation
@M00nF1sh can you take a look at this? |
LGTM |
@@ -89,11 +96,79 @@ func (a *networkingAspect) ensureEKSNetworkConfiguration(cfg *api.NodeConfig) er | |||
return nil | |||
} | |||
|
|||
func (a *networkingAspect) ensureMulticardNetworkConfiguration(cfg *api.NodeConfig) error { | |||
var networkRestartRequired bool | |||
routeTableId := 1001 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how is this routeTableId chosen to be starts with 1001 for those multi-card ENIs? is it chosen to align with our AL2? Maybe better to align with AL2023's default behavior if there is no specific reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose it to align with AL2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm ok with the 1000 route table thing, but prefer to align with AL2023's default behavior whenever possible. e.g. what's the route table routes for those enis set to if launch on a normal AL2023 instead of EKS's ones. i have no idea why 1000 were chosen for EKS Al2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of have a hacky "1000" route table, unless there are reasons. the major concern i have is this hard-coded 1000 might conflicts with other products who uses secondary route tables(vpc-cni for example). So the closer it aligns with AL2023's default behavior, the less surprise to customers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed offline, the reason for having the routes in separate route tables was because I was aligning with what we have in AL2. However nowadays CNI actually skips non-zero cards so we'll go with the approach of adding the routes for non-zero cards in the main routing table.
networkRestartRequired = true | ||
} | ||
|
||
if networkRestartRequired { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: maybe we should combine the "reloadNetworkConfigurations" with ensureEKSNetworkConfiguration
to do reload once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can restart the network in the Setup function, after we configure both primary and multicard interfaces.
continue | ||
} | ||
|
||
networkInterfaceConfName := fmt.Sprintf("80-card%d.network", card.CardIndex) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, it should be possible to have multiple ENIs per network card(with different deviceIndex), have we tested this behavior?
seems it won't work if you have multiple ENI on a network card
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your point. The problem we face is just the name of the file right? I can add the deviceIndex
in the name: 80-card%d-%d.network
, or we can use the mac address too. Do you have any preference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'd like to avoid couple to cardIndex&deviceIndex(it's complicated, see https://github.com/amazonlinux/amazon-ec2-net-utils/blob/e01f53f278eeb13bbdc856da921584944c825286/lib/lib.sh#L344)
I prefer to use 70-<eni-xxxx>.network
where eni-xxxx
is the eni id, you can get it from ec2Metadata
Issue #, if available:
Description of changes:
This PR is a followup from a previous one, where we decided to change the approach and let
nodeadm
generate.network
files that will be handled bysystemd-networkd
service.In this PR,
nodeadm
will create/etc/systemd/network/70-eni-${eni_id}.network
files for each non 0 indexed card. It will skip cards that are 0 indexed or non 0 cards that don't have IP configured from EC2.By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Testing Done
I tested pod-pod networking and also utilizing the non 0 indexed interfaced from pods that were running in the host network namespace. Here are some test results:
Route tables:
See this guide for recommended testing for PRs. Some tests may not apply. Completing tests and providing additional validation steps are not required, but it is recommended and may reduce review time and time to merge.