VPA gardener-extension-networking-calico-vpa has limit scaling active, triggers known VPA OOMkill-loop bug
#339
Labels
area/auto-scaling
Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related
area/networking
Networking related
kind/bug
Bug
lifecycle/stale
Nobody worked on this for 6 months (will further age)
How to categorize this issue?
/area networking
/area auto-scaling
/kind bug
What happened:
The VPA gardener-extension-networking-calico-vpa does not specify controlledValues, so it falls back to the default, RequestsAndLimits, and scales both requests and limits. This occasionally scales the memory limit down too far. That in turn triggers a known VPA bug in which VPA fails to respond to rapid OOMkills, and the container gets stuck in an OOMkill-restart-OOMkill loop indefinitely.
On the Gardener side, the cause should be considered the combination of having a memory limit and scaling it. That combination is known to cause the situation described above and should be avoided. The default policy, subject to component owner discretion, is: components that are critical to Gardener's operation have no memory limit. For a non-critical component, a memory limit may or may not be beneficial, but it should not be scaled.
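For illustration, one way to stop the limit from being scaled is to set controlledValues to RequestsOnly in the VPA's container policy. The manifest below is only a sketch: the VPA name comes from this issue, while the namespace, the target Deployment name, the update mode, and the wildcard container name are assumptions and would need to match the actual extension chart.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: gardener-extension-networking-calico-vpa
  namespace: extension-networking-calico        # assumption: placeholder namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gardener-extension-networking-calico  # assumption: target workload name
  updatePolicy:
    updateMode: Auto                            # assumption: existing mode left unchanged
  resourcePolicy:
    containerPolicies:
    - containerName: '*'                        # assumption: apply to all containers
      # Scale requests only; any memory limit on the container stays fixed.
      controlledValues: RequestsOnly
```

With RequestsOnly, VPA leaves whatever limit the pod template defines untouched. Whether the container should keep a memory limit at all remains the component owner's decision, per the policy above.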