Skip to content

Conversation

kishorj
Copy link
Contributor

@kishorj kishorj commented Feb 18, 2021

Fixes: #1744

Overview

The support for NLB instance mode is available from the k8s service controller, and any changes or improvements are tied to the upstream release schedule. In addition, there are also limitations on using CRDs in the in-tree controller.

This PR is for adding support for NLB instance mode in the AWS Load balancer controller. The controller already supports instance mode for ALB, and we will build up on the TargetGroupBinding feature for NLB instance mode as well.

Moving forward, the k8s AWS service controller will have support for NLB instance mode and CLB with basic features only. For more NLB features like the TargetGroupBinding, proxy v2 support, improved health checks, IP target mode we recommend running the AWS load balancer controller. Each service of type LoadBalancer on AWS k8s is reconciled either by the in-tree controller or the external controller based on the annotations provided the version requirements are met.

k8s version requirement

The instance mode support for service type LoadBalancer requires k8s 1.20 (non-EKS) or later where we've modified the in-tree controller to ignore service with aws-load-balancer-type annotation of external or nlb-ip. For EKS, we've backported the changes to EKS 1.16 and later releases.

How to configure load balancer target type?

If the service annotation service.beta.kubernetes.io/aws-load-balancer-type is external, this controller will look for an additional annotation service.beta.kubernetes.io/aws-load-balancer-target-type to determine the target type. The current supported values for the target types annotation are nlb-instance and nlb-ip.

Instance target

Example service spec for instance target

kind: Service
apiVersion: v1
metadata:
  name: nlb-svc
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-target-type: "nlb-instance"
spec:
  ports:
    - name: port
      port: 80
      targetPort: 80
      protocol: TCP
  type: LoadBalancer
  selector:
    app: instance-mode

IP target

For consistency with instance target mode, IP mode will also be determined based on the following annotations -

service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-target-type: nlb-ip

For backwards compatibility, the service.beta.kubernetes.io/aws-load-balancer-type annotation value of nlb-ip will still be interpreted as IP mode NLB.

Alternative considered

Modify the in-tree controller to support additional value for the service.beta.kubernetes.io/aws-load-balancer-type annotation, for example

service.beta.kubernetes.io/aws-load-balancer-type: nlb-instance

Pros:

  • Single annotation, and is similar to nlb-ip

Cons:

  • We have to backport the change to all supported EKS versions, and will be dependent on the patch rollout. This results in different behavior across multiple platform versions and will be more confusing to the end users
  • One more patch to the in-tree controller we need to maintain.

Testing

  • Create service of type Load Balancer with instance targets, External Traffic Policy set to Cluster
  • Create service of type Load Balancer with instance targets, External Traffic Policy set to Local
  • health check annotations get honored
  • proxy protocol v2
  • preserve client IP can be disabled for instance mode
  • Security group rules get added as expected
  • The labels specified inservice.beta.kubernetes.io/aws-load-balancer-target-node-labels annotation works as expected
    • Without this annotation, the default filter gets applied
    • When service is created/edited with this annotaton, only matching nodes get registered
    • On edit, any nodes not matching the labels get de-registered from the target group
    • Node label edits are correctly reconciled

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Feb 18, 2021
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Feb 18, 2021
@kishorj kishorj marked this pull request as draft February 18, 2021 06:38
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 18, 2021
@kishorj kishorj changed the title [draft] add support for NLB instance mode [wip] add support for NLB instance mode Feb 18, 2021
@kishorj kishorj changed the title [wip] add support for NLB instance mode [WIP] add support for NLB instance mode Feb 19, 2021
@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 19, 2021
@kishorj
Copy link
Contributor Author

kishorj commented Feb 19, 2021

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Feb 19, 2021
@kishorj kishorj changed the title [WIP] add support for NLB instance mode Add support for NLB instance mode Feb 20, 2021
@kishorj kishorj marked this pull request as ready for review February 20, 2021 01:13
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 20, 2021
@codecov-io
Copy link

codecov-io commented Feb 25, 2021

Codecov Report

Merging #1832 (4729f3a) into main (88b90c0) will increase coverage by 0.35%.
The diff coverage is 82.82%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #1832      +/-   ##
==========================================
+ Coverage   47.62%   47.98%   +0.35%     
==========================================
  Files         110      110              
  Lines        6110     6177      +67     
==========================================
+ Hits         2910     2964      +54     
- Misses       2931     2938       +7     
- Partials      269      275       +6     
Impacted Files Coverage Δ
pkg/deploy/elbv2/target_group_binding_manager.go 0.00% <0.00%> (ø)
pkg/model/elbv2/target_group_binding.go 0.00% <ø> (ø)
pkg/service/model_build_target_group.go 82.31% <82.41%> (-0.67%) ⬇️
pkg/service/model_builder.go 81.48% <100.00%> (+2.75%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 88b90c0...5574a55. Read the comment docs.

@TBBle
Copy link
Contributor

TBBle commented Feb 25, 2021

Does this implementation also suffer from kubernetes/cloud-provider-aws#87 with externalTrafficPolicy: Local? And if so, would it make sense to (later) have service.beta.kubernetes.io/aws-load-balancer-target-type: nlb-instance and externalTrafficPolicy: Local together trigger a different registration mode which registers only instances with local Endpoints for the given Service? I don't believe this was possible via the Cloud Provider APIs (there was no notification I could see for Endpoint changes), but with the distinct Load Balancer Controller, I think this could be feasible. It perhaps presents scaling concerns due to having to react to Endpoint create/delete instead of just Node create/delete.

Alternatively, just documenting that if you want externalTrafficPolicy: Local, you're better off with service.beta.kubernetes.io/aws-load-balancer-target-type: nlb-ip to avoid this "known issue" would be fine.

Asking due to #1842. It's not directly related due to as it doesn't actually mention externalTrafficPolicy: Local, but it prompted me to wonder this.

@M00nF1sh
Copy link
Collaborator

M00nF1sh commented Feb 26, 2021

@TBBle Yes, this initial implementation will still suffer from that NLB issue as well. (NLB team is working on changes on their end to avoid initial targets receive traffic).

We might do follow up PRs to optimize the issue in the way as you described. However, it will suffer from the long NLB registration delay(another issue NLB team is working on fixes) you saw in NLB-IP mode, e.g. pod relocates to another Node.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 26, 2021
@kishorj kishorj added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Feb 26, 2021
Copy link
Collaborator

@M00nF1sh M00nF1sh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 2, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kishorj, M00nF1sh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit 04c9bf5 into kubernetes-sigs:main Mar 2, 2021
@kishorj kishorj deleted the nlb-instance branch March 3, 2021 00:58
Timothy-Dougherty pushed a commit to adammw/aws-load-balancer-controller that referenced this pull request Nov 9, 2023
* add support for NLB instance mode

* decouple health check default values from builder task
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nlb instance support
5 participants