-
Notifications
You must be signed in to change notification settings - Fork 784
Enhanced Subnet Discovery #3380
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
96310d6
to
0a64e10
Compare
pkg/ipamd/ipamd.go
Outdated
// Mark primary ENI as excluded if primary subnet is excluded | ||
if eni == primaryENI && c.isPrimarySubnetExcluded { | ||
log.Infof("Marking primary ENI %s as excluded from pod IP allocation", eni) | ||
err := c.dataStoreAccess.GetDataStore(eniMetadata.NetworkCard).SetENIExcludedForPodIPs(eni, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check if eniMetadata.NetworkCard
is the same network card as 0.
f1898c7
to
7945b5a
Compare
name: "IPv6 enabled with subnet discovery and mixed IPv4/IPv6 subnets", | ||
subnets: []ec2types.Subnet{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to check edge case in code if customer is on IPv4 cluster and the code picks up an IPv6 subnet because that's all they had tagged in the console and vice versa.
903a226
to
0368ae0
Compare
pkg/awsutils/awsutils.go
Outdated
// When subnet discovery is disabled, check if primary subnet is excluded | ||
excluded, checkErr := cache.isPrimarySubnetExcluded() | ||
if checkErr != nil { | ||
// If we can't determine exclusion status, log warning and proceed | ||
log.Warnf("Failed to check if primary subnet is excluded: %v. Proceeding with ENI creation attempt.", checkErr) | ||
} else if excluded { | ||
// Primary subnet is explicitly excluded | ||
err = errors.New("primary subnet is tagged with kubernetes.io/role/cni=0 and subnet discovery is disabled - no valid subnets available for ENI creation") | ||
log.Error(err.Error()) | ||
return "", err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This checks if Cx has a tagged subnet, but have not enable subnet discovery. Do we want to do this check?
pkg/ipamd/ipamd.go
Outdated
if c.enableIPv6 { | ||
assignedIPs = primaryENIInfo.AssignedIPv6Addresses() | ||
ipType = "IPv6" | ||
} else { | ||
assignedIPs = primaryENIInfo.AssignedIPv4Addresses() | ||
ipType = "IPv4" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See if we can use hasPods()
function.
Currently hasPods()
only checks for IPv4 assigned addresses, will need to update that func to support IPv6.
// Skip ENIs that are excluded from pod IP allocation | ||
if eni.IsExcludedForPodIPs { | ||
ds.log.Debugf("Skip needs IP check for ENI %s as it is excluded from pod IP allocation", eni.ID) | ||
continue | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will skip the primary ENI in GetAllocatableENIs
function and prevent it from getting IPs assigned/allocated
0368ae0
to
ea0763f
Compare
pkg/ipamd/ipamd_test.go
Outdated
assert.False(t, wasUsed, "Primary ENI should be excluded and auto-cleanup prefixes") | ||
} | ||
|
||
func TestPrimaryENIAutoCleanupFailureHandling(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add case for IPv6 mode too
587b9f3
to
ea0763f
Compare
describeSGInput := &ec2.DescribeSecurityGroupsInput{ | ||
Filters: []ec2types.Filter{ | ||
{ | ||
Name: aws.String("vpc-id"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need support global SGs which could be in another VPCs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe we support VPC peering in CNI. I could not find any existing logic that I could leverage.
I think keeping it local VPC only might keep things simpler, and we could do VPC peering as a follow up.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
global SGs is not VPC peering. EC2 support using SG from another VPC.
pkg/awsutils/awsutils.go
Outdated
}, | ||
{ | ||
Name: aws.String("tag:" + subnetDiscoveryTagKey), | ||
Values: []string{"1"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: let's make them to const, such as selected = 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
eniIDs = append(eniIDs, cache.getFilteredENIs(ctx, ds, true)...) // only secondary subnet ENIs | ||
} | ||
|
||
cache.applySecurityGroupsToENIs(ctx, eniIDs, primarySGs, "Applying primary security groups to") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should have error to handle.
return err | ||
} | ||
|
||
addedCount, deletedCount := cache.detectSecurityGroupChanges(sgIDs, &cache.securityGroups, "primary") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same arbitary argument being passed through.
primarySubnetENIs := cache.getFilteredENIs(ctx, ds, false) | ||
for _, eniID := range primarySubnetENIs { | ||
// Filter out unmanaged ENIs | ||
if !cache.unmanagedENIs.Has(eniID) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I feel unmanaged ENIs should be handled in cache.getFilteredENIs()
.
} | ||
|
||
// Apply security groups to the filtered ENIs | ||
cache.applySecurityGroupsToENIs(ctx, eniIDs, sgIDs, "Update") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
log.Warnf("Failed to check if primary subnet is excluded: %v. Proceeding with ENI creation attempt.", checkErr) | ||
} else if excluded { | ||
// Primary subnet is explicitly excluded | ||
return "", fmt.Errorf("primary subnet is tagged with kubernetes.io/role/cni=0 - no valid subnets available for ENI creation") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest surface this error to customers.
if vpcErr != nil { | ||
log.Warnf("Failed to call ec2:DescribeSubnets: %v", vpcErr) | ||
log.Info("Defaulting to same subnet as the primary interface for the new ENI") | ||
|
||
// Even in fallback, check if primary subnet is excluded | ||
excluded, checkErr := cache.IsPrimarySubnetExcluded(ctx) | ||
excluded, checkErr := cache.IsSubnetExcluded(ctx, cache.subnetID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use primarySubnetID = cache.subnetID
9036276
to
a188827
Compare
f0bc83a
to
1d749c9
Compare
8baff5a
to
34be861
Compare
What type of PR is this?
feature
Which issue does this PR fix?:
#3067
#2904
What does this PR do / Why do we need it?:
This PR adds a requested customer feature to allow customers to
kubernetes.io/role/cni=1
orkubernetes.io/role/cni=0
respectively.kubernetes.io/role/cni=1
tagged subnet if the alternate security group is also taggedkubernetes.io/role/cni=1
.kubernetes.io/cluster/<my-example-cluster>
tag. See Enhanced subnet discovery should use configurable tags #2904 more.Testing done on this change:
Will this PR introduce any new dependencies?:
Will this break upgrades or downgrades? Has updating a running cluster been tested?:
Does this change require updates to the CNI daemonset config files to work?:
ENABLE_SUBNET_DISCOVERY
enabled which is already done by default.Does this PR introduce any user-facing change?:
Users were already tagging their subnets as the part of the original subnet discovery feature. This enhanced feature will also allow users to tag alternate security groups.
Make sure that IPv6 documentation and managed IPv4 policy are updated.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Integration tests passing in IPv4 cluster.
Currently adapting our test framework to work with IPv6 cluster. Will paste results here.