
Kube Adaptor Restart on Lost Lease #2291

@c-kruse

Description

Describe the bug
Within a Site, only one kube-adaptor container should be considered the leader at a time. This includes situations where multiple router deployments are running for HA, as well as momentary rollout scale up/down events. To achieve this, the kube-adaptor uses the Kubernetes Leases API (coordination.k8s.io).

Presently, when the current leader loses the Lease, usually due to Lease API availability issues, the container exits with code 1. The kube-adaptor container is then restarted in the Pod (Pod .spec.restartPolicy=Always), sometimes entering CrashLoopBackOff when the issue persists.
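
For illustration only, a minimal client-go sketch of this pattern; the lease name, namespace/identity lookup, and timings below are hypothetical and not taken from the kube-adaptor source:

// Hypothetical sketch of leader election over a Lease, mirroring the
// reported exit-on-lost-lease behavior; not the actual kube-adaptor code.
package main

import (
	"context"
	"os"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/leaderelection"
	"k8s.io/client-go/tools/leaderelection/resourcelock"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Lease lock shared by all kube-adaptor containers in the Site.
	lock := &resourcelock.LeaseLock{
		LeaseMeta:  metav1.ObjectMeta{Name: "skupper-site-lease", Namespace: os.Getenv("NAMESPACE")},
		Client:     client.CoordinationV1(),
		LockConfig: resourcelock.ResourceLockConfig{Identity: os.Getenv("HOSTNAME")},
	}

	leaderelection.RunOrDie(context.Background(), leaderelection.LeaderElectionConfig{
		Lock:          lock,
		LeaseDuration: 15 * time.Second,
		RenewDeadline: 10 * time.Second,
		RetryPeriod:   2 * time.Second,
		Callbacks: leaderelection.LeaderCallbacks{
			OnStartedLeading: func(ctx context.Context) {
				// Leader-only work (site flow controller, status sync) would run here.
				<-ctx.Done()
			},
			OnStoppedLeading: func() {
				// Current behavior: exit and rely on the Pod restart policy.
				os.Exit(1)
			},
		},
	})
}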

While the router should continue to operate gracefully as configured without a running adaptor, this ends up contributing to larger network instability in a few ways.

  • Readiness: the Pod readiness check depends on a running kube-adaptor container. When the Pod is marked as not Ready, it is removed from the EndpointSlices so that Service traffic is not sent to the router.
  • Configuration drift: While the kube-adaptor is not running, it is not syncing desired configuration to the router.

Originally reported here: #2250

How To Reproduce
Steps to reproduce the behavior:
⚠️ Applying this configuration will indiscriminately limit Kubernetes API Lease operations and should be considered harmful to overall cluster health. ⚠️

---
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: PriorityLevelConfiguration
metadata:
  name: gh2291-leader-election
spec:
  limited:
    lendablePercent: 0
    limitResponse:
      type: Reject
    nominalConcurrencyShares: 0
    borrowingLimitPercent: 0
  type: Limited
---
apiVersion: flowcontrol.apiserver.k8s.io/v1
kind: FlowSchema
metadata:
  name: gh2291-leader-election
spec:
  distinguisherMethod:
    type: ByUser
  matchingPrecedence: 110
  priorityLevelConfiguration:
    name: gh2291-leader-election
  rules:
    - resourceRules:
        - apiGroups:
            - coordination.k8s.io
          resources:
            - leases
          namespaces:
            - '*'
          verbs:
            - get
            - watch
            - create
            - update
      subjects:
        - kind: Group
          group:
            name: system:serviceaccounts

  • Apply the above API Priority and Fairness configuration to severely limit API server concurrency on the Leases API.
  • Deploy Skupper Sites until kube-adaptors begin to crash; how many are needed depends on your cluster. Locally with kind (single-node control plane and etcd) it took roughly 15 sites.

Alternatively, fight the skupper controller by editing the skupper-router Role to remove the create, update, and delete verbs from the leases rule.

Expected behavior

When leader election is lost (see the sketch after this list):

  • do not exit
  • stop the leader processes (site flow controller + status sync)
  • log the error
  • retry leader election with backoff
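
A hedged sketch of this, building on the client-go snippet above (same imports plus "log", with OnStoppedLeading changed to only log rather than exit; the function name, backoff values, and 30s cap are illustrative, not an actual implementation):

// runWithRetry re-enters leader election after the lease is lost instead of
// exiting. The ctx handed to OnStartedLeading is cancelled by client-go before
// OnStoppedLeading runs, which should stop the site flow controller and status
// sync; the error is logged and the election retried with a capped backoff.
func runWithRetry(ctx context.Context, cfg leaderelection.LeaderElectionConfig) {
	delay := time.Second
	for {
		// RunOrDie blocks until leadership is acquired and later lost,
		// or until ctx is cancelled.
		leaderelection.RunOrDie(ctx, cfg)
		if ctx.Err() != nil {
			return // the adaptor itself is shutting down
		}
		log.Printf("lost leader lease; retrying election in %s", delay)
		select {
		case <-time.After(delay):
		case <-ctx.Done():
			return
		}
		if delay < 30*time.Second {
			delay *= 2
		}
	}
}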

Environment details

  • Skupper Operator: 2.0+
  • Platform: kubernetes
