VPA gardener-extension-networking-calico-vpa has limit scaling active, triggers known VPA OOMkill-loop bug #339

Open
andrerun opened this issue Feb 21, 2024 · 0 comments
Labels
area/auto-scaling Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related area/networking Networking related kind/bug Bug lifecycle/stale Nobody worked on this for 6 months (will further age)

Comments

@andrerun
How to categorize this issue?
/area networking
/area auto-scaling
/kind bug

What happened:
The VPA `gardener-extension-networking-calico-vpa` does not specify `controlledValues`. It therefore acts per the default and scales both requests and limits. This occasionally results in the memory limit being scaled down excessively. That in turn triggers a known VPA bug, where VPA fails to respond to quick OOMkills and the container gets stuck in an OOMkill-restart-OOMkill loop indefinitely.

On the Gardener side, the cause should be considered to be the combination of having a memory limit and scaling it. That combination is known to cause the situation described above and should be avoided. The default policy, subject to component owner discretion, is that components critical to Gardener's operation have no memory limit. For a non-critical component, a memory limit may or may not be beneficial, but it should not be scaled.
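
For illustration, a minimal sketch of a VPA that scales requests only. This is not the manifest currently deployed by the extension. The namespace and the `targetRef` name are assumptions made for the example; only the VPA name comes from this issue. `controlledValues: RequestsOnly` is the upstream VPA setting that leaves limits untouched (the default is `RequestsAndLimits`).

```yaml
# Sketch only. Namespace and target name are hypothetical, not taken from the extension's charts.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: gardener-extension-networking-calico-vpa
  namespace: extension-networking-calico        # assumed namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: gardener-extension-networking-calico  # assumed target workload
  resourcePolicy:
    containerPolicies:
    - containerName: "*"                        # apply to all containers of the target
      controlledValues: RequestsOnly            # scale requests, leave limits as set in the pod spec
```

With this setting, any memory limit stays exactly as written in the workload's pod template. Whether the component should carry a memory limit at all is a separate decision made in that pod template, per the policy above.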

@gardener-robot gardener-robot added area/auto-scaling Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related area/networking Networking related kind/bug Bug labels Feb 21, 2024
@gardener-robot gardener-robot added the lifecycle/stale Nobody worked on this for 6 months (will further age) label Oct 30, 2024