Pods are going into pending state after upgrading from v1.26.12-k3s1 to v1.27.11-k3s1 and v1.28.5-k3s1 (Issue is quite random) #10044
-
Environmental Info:
Node(s) CPU architecture, OS, and Version:
Cluster Configuration:
Describe the bug:
I am currently facing one scenario that is quite weird. Say I am running 3 replicas of an nginx deployment; when the issue occurs and I scale the replicas to 5, the 2 new pods stay in Pending state. At the same time, I deleted one of the older nginx pods and described the svc, and the Service endpoints were still showing the IPs of the original 3 pods.
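For reference, the checks went roughly like this (the deployment name is the nginx-deployment example above; the pod and Service names are placeholders):

```bash
# Scale the example deployment from 3 to 5 replicas
kubectl scale deployment nginx-deployment --replicas=5

# The two new pods stay in Pending
kubectl get pods -o wide

# Delete one of the original pods, then inspect the Service;
# its endpoints still list the IPs of the original 3 pods
kubectl delete pod <one-of-the-older-nginx-pods>
kubectl describe svc <nginx-service>
kubectl get endpoints <nginx-service> -o yaml
```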
Node Conditions
Steps To Reproduce:
Expected behavior:
Actual behavior:
-
Why are they pending? If you describe the pod and/or check the kubelet logs on the node, it should tell you why. Just saying that they are pending doesn't really provide enough information to work off of.
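A minimal sketch of those checks (names are placeholders; on k3s the kubelet is embedded in the k3s service, so its logs come from journalctl rather than a separate kubelet unit):

```bash
# The Events section at the bottom usually explains the Pending state
# (unschedulable, image pull, volume attach, or no scheduler activity at all)
kubectl describe pod <pending-pod> -n <namespace>

# Recent events in the namespace, oldest first
kubectl get events -n <namespace> --sort-by=.metadata.creationTimestamp

# Kubelet logs on the node that should be running the pod
journalctl -u k3s          # on a server node
journalctl -u k3s-agent    # on an agent node
```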
-
So you have a single server node? I don't see it listed in your nodes list; are you running with --disable-agent, or did you just not show it? Have you checked the server logs to see if there are any errors from the scheduler or controller-manager? If you are running a single server with sqlite there is no leader election, so the components should always be running and active on the server.
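A rough way to check that, assuming a systemd-based install (on k3s the scheduler and controller-manager run embedded in the k3s server process, so their output goes to the k3s service log):

```bash
# Look for scheduler / controller-manager errors in the server log
journalctl -u k3s --since "1 hour ago" | grep -iE 'scheduler|controller-manager|error'

# Sanity-check that the apiserver itself reports healthy
kubectl get --raw='/readyz?verbose'
```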
-
@brandond I guess the issue was related to a change in Kubernetes 1.27 that caused issues with watching events for certain clients. The change introduced in v1.27 allowed certain watches to go directly to etcd/kine completely unfiltered, potentially starving all other watches and leading to the observed problems.
Issue: kubernetes/kubernetes#123448
Fix: kubernetes/kubernetes#123532
I have tried k3s versions 1.27.13-k3s1 and 1.28.9-k3s1, and I haven't faced this issue yet. Still monitoring, but I feel this was the issue and the fix resolves it.
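For anyone else hitting this, a minimal sketch of pinning an install or upgrade to one of those releases with the standard k3s install script (note that the release tags use the v1.27.13+k3s1 / v1.28.9+k3s1 form):

```bash
# Install or upgrade to a release containing the watch fix
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.27.13+k3s1" sh -
```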