Description
What happened:
We have an EKS cluster in AWS with self-managed Linux node groups (instance type c5a.2xlarge). We have configured a warm pool (instances kept in the stopped state) in the Auto Scaling group. Whenever new pods trigger a scale-up that moves instances from the warm pool into the group, we see the warning below in the cluster events, and the overall pod start time increases.
Warning FailedCreatePodSandBox 118s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "719d35bdbfe36a98c614414306a9596bd9fa1c8a1c5db554b2ed19a44cb54b04": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: Error received from AddNetwork gRPC call: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:50051: connect: connection refused"
Attach logs
Warning FailedCreatePodSandBox 118s kubelet Failed to create pod sandbox: rpc error: code = Unknown desc = failed to setup network for sandbox "719d35bdbfe36a98c614414306a9596bd9fa1c8a1c5db554b2ed19a44cb54b04": plugin type="aws-cni" name="aws-cni" failed (add): add cmd: Error received from AddNetwork gRPC call: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:50051: connect: connection refused"
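The event shows the CNI plugin being refused a TCP connection to 127.0.0.1:50051 on the node. To measure how long that port stays closed after an instance leaves the warm pool, a small probe like the one below could be run on the node. This is only a sketch: the helper name `wait_for_port` and its timeout and interval defaults are hypothetical, and it checks only that the port accepts TCP connections, not that the gRPC service behind it is healthy.

```python
import socket
import time


def wait_for_port(host: str, port: int,
                  timeout_s: float = 60.0, interval_s: float = 1.0) -> float:
    """Poll host:port until a TCP connect succeeds; return seconds waited.

    Raises TimeoutError if no connection is accepted within timeout_s.
    (Hypothetical helper for illustration; not part of any AWS tooling.)
    """
    start = time.monotonic()
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return time.monotonic() - start
        except OSError:
            # Covers "connection refused" and connect timeouts alike.
            if time.monotonic() - start >= timeout_s:
                raise TimeoutError(
                    f"{host}:{port} not accepting connections after {timeout_s}s"
                )
            time.sleep(interval_s)
```

Calling `wait_for_port("127.0.0.1", 50051)` right after the node boots returns the number of seconds until the port first accepted a connection, which can be compared against the age of the `FailedCreatePodSandBox` events.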
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
Environment:
- Kubernetes version (use `kubectl version`):
- CNI Version: v1.19.2-eksbuild.5
- OS (e.g: `cat /etc/os-release`):
- Kernel (e.g. `uname -a`):