
VPA updater constantly fails to match a container that doesn't even exist #6215

Closed
rkashasl opened this issue Oct 20, 2023 · 7 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@rkashasl

Hello!
We are using the latest VPA chart:

---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: vertical-pod-autoscaler
  namespace: kube-system
spec:
  interval: 14m
  url: "https://cowboysysop.github.io/charts/"
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: vertical-pod-autoscaler
  namespace: kube-system
spec:
  install:
    createNamespace: true
    crds: CreateReplace
  upgrade:
    crds: CreateReplace
  releaseName: vertical-pod-autoscaler
  interval: 9m
  chart:
    spec:
      # renovate: registryUrl=https://cowboysysop.github.io/charts/
      chart: vertical-pod-autoscaler
      version: 7.2.0
      sourceRef:
        kind: HelmRepository
        name: vertical-pod-autoscaler
        namespace: kube-system
      interval: 14m
  values:
    admissionController:
      tolerations:
      - key: "arch"
        operator: "Equal"
        value: "arm64"
        effect: "NoSchedule"
      resources:
        limits:
          memory: 50Mi
        requests:
          cpu: 10m
          memory: 40Mi
    recommender:
      extraArgs:
        pod-recommendation-min-memory-mb: 30
      tolerations:
      - key: "arch"
        operator: "Equal"
        value: "arm64"
        effect: "NoSchedule"
      resources:
        limits:
          memory: 250Mi
        requests:
          cpu: 10m
          memory: 150Mi
    updater:
      tolerations:
      - key: "arch"
        operator: "Equal"
        value: "arm64"
        effect: "NoSchedule"
      resources:
        limits:
          memory: 50Mi
        requests:
          cpu: 10m
          memory: 50Mi

However, the vpa-updater pod keeps spamming errors about a cert-manager container:

vertical-pod-autoscaler-updater-7747d6547-qbt96 I1020 10:14:53.194314       1 capping.go:79] no matching Container found for recommendation cert-manager
vertical-pod-autoscaler-updater-7747d6547-qbt96 I1020 10:14:53.194621       1 capping.go:79] no matching Container found for recommendation cert-manager
vertical-pod-autoscaler-updater-7747d6547-qbt96 I1020 10:15:53.202479       1 capping.go:79] no matching Container found for recommendation cert-manager
vertical-pod-autoscaler-updater-7747d6547-qbt96 I1020 10:15:53.232326       1 capping.go:79] no matching Container found for recommendation cert-manager
vertical-pod-autoscaler-updater-7747d6547-qbt96 I1020 10:16:53.193146       1 capping.go:79] no matching Container found for recommendation cert-manager

Here is the cert-manager deployment and its VPAs:

---
apiVersion: source.toolkit.fluxcd.io/v1beta2
kind: HelmRepository
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  interval: 14m
  url: "https://charts.jetstack.io/"
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  install:
    createNamespace: true
    crds: CreateReplace
  upgrade:
    crds: CreateReplace
  interval: 9m
  chart:
    spec:
      # renovate: registryUrl=https://charts.jetstack.io/
      chart: cert-manager
      version: v1.13.1
      sourceRef:
        kind: HelmRepository
        name: cert-manager
        namespace: cert-manager
      interval: 14m
  values:
    installCRDs: true
    serviceAccount:
      create: false
      name: certmanager-oidc
    global:
      priorityClassName: above-average
    prometheus:
      enabled: true
      servicemonitor:
        enabled: true
        prometheusInstance: prometheus-kube-prometheus-prometheus
    ingressShim:
      defaultIssuerName: letsencrypt-prod
      defaultIssuerKind: ClusterIssuer
    webhook:
      tolerations:
      - key: "arch"
        operator: "Equal"
        value: "arm64"
        effect: "NoSchedule"
      resources:
        limits:
          memory: 64Mi
        requests:
          memory: 32Mi
          cpu: 10m
    cainjector:
      tolerations:
      - key: "arch"
        operator: "Equal"
        value: "arm64"
        effect: "NoSchedule"
      extraArgs:
      - "--leader-elect=false"
      resources:
        limits:
          memory: 512Mi
        requests:
          memory: 128Mi
          cpu: 10m
    resources:
      limits:
        memory: 384Mi
      requests:
        memory: 160Mi
        cpu: 10m
    tolerations:
    - key: "arch"
      operator: "Equal"
      value: "arm64"
      effect: "NoSchedule"
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cert-manager
  updatePolicy:
    updateMode: Recreate
    minReplicas: 1
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: cert-manager-cainjector
  namespace: cert-manager
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cert-manager-cainjector
  updatePolicy:
    updateMode: Recreate
    minReplicas: 1
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: cert-manager-webhook
  namespace: cert-manager
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cert-manager-webhook
  updatePolicy:
    updateMode: Recreate
    minReplicas: 1

And there is no container named cert-manager in the deployment for that recommendation to match:

Name:                   cert-manager
Namespace:              cert-manager
CreationTimestamp:      Wed, 18 Oct 2023 17:14:32 +0300
Labels:                 app=cert-manager
                        app.kubernetes.io/component=controller
                        app.kubernetes.io/instance=cert-manager
                        app.kubernetes.io/managed-by=Helm
                        app.kubernetes.io/name=cert-manager
                        app.kubernetes.io/version=v1.13.1
                        helm.sh/chart=cert-manager-v1.13.1
                        helm.toolkit.fluxcd.io/name=cert-manager
                        helm.toolkit.fluxcd.io/namespace=cert-manager
Annotations:            deployment.kubernetes.io/revision: 1
                        meta.helm.sh/release-name: cert-manager
                        meta.helm.sh/release-namespace: cert-manager
Selector:               app.kubernetes.io/component=controller,app.kubernetes.io/instance=cert-manager,app.kubernetes.io/name=cert-manager
Replicas:               1 desired | 1 updated | 1 total | 1 available | 0 unavailable
StrategyType:           RollingUpdate
MinReadySeconds:        0
RollingUpdateStrategy:  25% max unavailable, 25% max surge
Pod Template:
  Labels:           app=cert-manager
                    app.kubernetes.io/component=controller
                    app.kubernetes.io/instance=cert-manager
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=cert-manager
                    app.kubernetes.io/version=v1.13.1
                    helm.sh/chart=cert-manager-v1.13.1
  Service Account:  certmanager-oidc
  Containers:
   cert-manager-controller:
    Image:       quay.io/jetstack/cert-manager-controller:v1.13.1
    Ports:       9402/TCP, 9403/TCP
    Host Ports:  0/TCP, 0/TCP
    Args:
      --v=2
      --cluster-resource-namespace=$(POD_NAMESPACE)
      --leader-election-namespace=kube-system
      --acme-http01-solver-image=quay.io/jetstack/cert-manager-acmesolver:v1.13.1
      --default-issuer-name=letsencrypt-prod
      --default-issuer-kind=ClusterIssuer
      --max-concurrent-challenges=60
    Limits:
      memory:  384Mi
    Requests:
      cpu:     10m
      memory:  160Mi
    Environment:
      POD_NAMESPACE:     (v1:metadata.namespace)
    Mounts:             <none>
  Volumes:              <none>
  Priority Class Name:  above-average
Conditions:
  Type           Status  Reason
  ----           ------  ------
  Progressing    True    NewReplicaSetAvailable
  Available      True    MinimumReplicasAvailable
OldReplicaSets:  <none>
NewReplicaSet:   cert-manager-7d47d666f8 (1/1 replicas created)
Events:          <none>
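Note that the chart names the container cert-manager-controller, while the recommendation refers to cert-manager. One possible workaround (a sketch, not verified against this chart) is to scope the VPA to the container name the deployment actually uses via resourcePolicy.containerPolicies:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: cert-manager
  updatePolicy:
    updateMode: Recreate
    minReplicas: 1
  resourcePolicy:
    containerPolicies:
    # Restrict autoscaling to the container that actually exists
    # in the pod spec (cert-manager-controller, not cert-manager).
    - containerName: cert-manager-controller
      mode: Auto
```

This does not remove an already-stored stale recommendation, but it makes the intended container explicit.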

@rkashasl rkashasl added the kind/bug Categorizes issue or PR as related to a bug. label Oct 20, 2023
@universam1

Same issue here. It looks like the VPA recommender is actually broken; no recommendations are applied to new VPAs.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 31, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 1, 2024
@rkashasl
Author

rkashasl commented Mar 5, 2024

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 5, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 3, 2024
@voelzmo
Contributor

voelzmo commented Jun 3, 2024

Without additional information, I assume this is another instance of #6744, which could be fixed by #6745.

TL;DR: stale recommendations that no longer have a matching container can exist, e.g. when a Container in a Pod was renamed.
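To illustrate the stale-recommendation case with the container names from this issue (a hypothetical status fragment; the values are made up):

```yaml
# Hypothetical VPA status after a container rename: the old entry
# ("cert-manager") lingers and triggers the "no matching Container
# found" log on every updater loop until it is cleaned up.
status:
  recommendation:
    containerRecommendations:
    - containerName: cert-manager            # stale: container no longer exists
      target:
        cpu: 11m
        memory: 163Mi
    - containerName: cert-manager-controller # current container name
      target:
        cpu: 11m
        memory: 163Mi
```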

I'm closing this in favor of the above-mentioned issue. Feel free to re-open with additional information.

/close

@k8s-ci-robot
Contributor

@voelzmo: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
