When upgrading to a new LWS version, rolling update gets triggered #281

Closed
Edwinhr716 opened this issue Dec 13, 2024 · 2 comments · Fixed by #277
kind/bug Categorizes issue or PR as related to a bug.

Comments

@Edwinhr716
Contributor

What happened:
When upgrading from release-0.2.0 to release-0.4.2, the template hash label on the pods and StatefulSets changes even though the LWS object has not changed. This triggers a rolling update that should not happen.

What you expected to happen:
The upgrade should complete without triggering a rolling update.

How to reproduce it (as minimally and precisely as possible):

  1. Deploy the lws controller with version 0.2
  2. Deploy a YAML manifest with a volumeMount:
apiVersion: leaderworkerset.x-k8s.io/v1
kind: LeaderWorkerSet
metadata:
  labels:
    app.kubernetes.io/name: leaderworkerset
    app.kubernetes.io/instance: leaderworkerset-multi-template
    app.kubernetes.io/part-of: lws
    app.kubernetes.io/managed-by: kustomize
    app.kubernetes.io/created-by: lws
  name: cube
spec:
  replicas: 2
  leaderWorkerTemplate:
    size: 4
    workerTemplate:
      spec:
        containers:
        - name: nginx
          image: nginx:1.14.1
          resources:
            limits:
              cpu: "100m"
            requests:
              cpu: "50m"
          ports:
          - containerPort: 8080
          volumeMounts:
          - name: dshm
            mountPath: /dev/shm
        volumes:
        - name: dshm
          emptyDir:
            medium: Memory

  3. Upgrade the lws controller to 0.4.2

Anything else we need to know?:
The template hash is generated from the string representation of the two pod templates. When a new field is added to any Kubernetes type that can be part of the pod template spec, the underlying string changes, even if the new field is not set.

You can see this by comparing the podTemplateSpec used to generate the template hash in version 0.2 with the one used in version 0.4.2.

Version 0.2

[] Volume {
    Volume {
        Name: dshm,
        VolumeSource: VolumeSource {
            HostPath: nil,
            EmptyDir: & EmptyDirVolumeSource {
                Medium: Memory,
                SizeLimit: < nil > ,
            },
            GCEPersistentDisk: nil,
            AWSElasticBlockStore: nil,
            GitRepo: nil,
            Secret: nil,
            NFS: nil,
            ISCSI: nil,
            Glusterfs: nil,
            PersistentVolumeClaim: nil,
            RBD: nil,
            FlexVolume: nil,
            Cinder: nil,
            CephFS: nil,
            Flocker: nil,
            DownwardAPI: nil,
            FC: nil,
            AzureFile: nil,
            ConfigMap: nil,
            VsphereVolume: nil,
            Quobyte: nil,
            AzureDisk: nil,
            PhotonPersistentDisk: nil,
            PortworxVolume: nil,
            ScaleIO: nil,
            Projected: nil,
            StorageOS: nil,
            CSI: nil,
            Ephemeral: nil,
        },
    },
}

Version 0.4.2

[] Volume {
    Volume {
        Name: dshm,
        VolumeSource: VolumeSource {
            HostPath: nil,
            EmptyDir: & EmptyDirVolumeSource {
                Medium: Memory,
                SizeLimit: < nil > ,
            },
            GCEPersistentDisk: nil,
            AWSElasticBlockStore: nil,
            GitRepo: nil,
            Secret: nil,
            NFS: nil,
            ISCSI: nil,
            Glusterfs: nil,
            PersistentVolumeClaim: nil,
            RBD: nil,
            FlexVolume: nil,
            Cinder: nil,
            CephFS: nil,
            Flocker: nil,
            DownwardAPI: nil,
            FC: nil,
            AzureFile: nil,
            ConfigMap: nil,
            VsphereVolume: nil,
            Quobyte: nil,
            AzureDisk: nil,
            PhotonPersistentDisk: nil,
            PortworxVolume: nil,
            ScaleIO: nil,
            Projected: nil,
            StorageOS: nil,
            CSI: nil,
            Ephemeral: nil,
            Image: nil,
        },
    },
}

The latter contains the field Image: nil, which was added in v0.31.0 of k8s.io/api (https://pkg.go.dev/k8s.io/api/core/v1#ImageVolumeSource). LWS 0.2 uses v0.29.3, while 0.4.2 uses v0.31.0.
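
The effect described above can be reproduced in isolation. The following is a minimal sketch, not the LWS code: `volumeOld`, `volumeNew`, and `stringHash` are hypothetical names, the structs are simplified stand-ins for the real `VolumeSource` type, and SHA-1 over the `%+v` rendering is an assumed hashing scheme chosen only for illustration.

```go
package main

import (
	"crypto/sha1"
	"encoding/hex"
	"fmt"
)

// volumeOld is a hypothetical, simplified stand-in for the volume type
// as compiled against the older API version.
type volumeOld struct {
	Name     string
	Medium   string
	HostPath *string
}

// volumeNew is the same type after a newer API version added an
// optional Image field. The field is never set in this example.
type volumeNew struct {
	Name     string
	Medium   string
	HostPath *string
	Image    *string
}

// stringHash hashes the string rendering of v. This mirrors the idea of
// hashing a template's string form; the real LWS hash may differ.
func stringHash(v interface{}) string {
	sum := sha1.Sum([]byte(fmt.Sprintf("%+v", v)))
	return hex.EncodeToString(sum[:])
}

func main() {
	oldVol := volumeOld{Name: "dshm", Medium: "Memory"}
	newVol := volumeNew{Name: "dshm", Medium: "Memory"}

	// The user-visible content is identical, but the rendering of newVol
	// contains an extra "Image:<nil>" entry, so the hashes differ.
	fmt.Println(stringHash(oldVol) == stringHash(newVol)) // prints "false"
}
```

Because the unset field still appears in the rendered string, any hash derived from that string changes whenever the type definition grows, regardless of what the user wrote in the manifest.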

Environment:

  • Kubernetes version (use kubectl version):
  • LWS version (use git describe --tags --dirty --always):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others:
Edwinhr716 added the kind/bug label on Dec 13, 2024
@Edwinhr716
Contributor Author

/assign @Edwinhr716

@ahg-g
Contributor

ahg-g commented Dec 16, 2024

The fix is to use a controller revision to track and detect changes to the spec (#239).
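
The comment does not spell out how #239 detects changes, so the following is only a sketch of one property that a comparison based on serialized specs can rely on: optional Kubernetes API fields are tagged `omitempty`, so an unset field added in a newer API version does not appear in the JSON at all. The struct types and `sameSerializedSpec` are hypothetical and are not taken from the LWS code base.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// volumeOld is a hypothetical, simplified stand-in for the volume type
// as compiled against the older API version.
type volumeOld struct {
	Name     string  `json:"name"`
	Medium   string  `json:"medium,omitempty"`
	HostPath *string `json:"hostPath,omitempty"`
}

// volumeNew adds an optional Image field, tagged omitempty in the same
// way optional fields are tagged in the Kubernetes API types.
type volumeNew struct {
	Name     string  `json:"name"`
	Medium   string  `json:"medium,omitempty"`
	HostPath *string `json:"hostPath,omitempty"`
	Image    *string `json:"image,omitempty"`
}

// sameSerializedSpec reports whether a and b serialize to identical JSON.
func sameSerializedSpec(a, b interface{}) (bool, error) {
	ja, err := json.Marshal(a)
	if err != nil {
		return false, err
	}
	jb, err := json.Marshal(b)
	if err != nil {
		return false, err
	}
	return bytes.Equal(ja, jb), nil
}

func main() {
	same, err := sameSerializedSpec(
		volumeOld{Name: "dshm", Medium: "Memory"},
		volumeNew{Name: "dshm", Medium: "Memory"},
	)
	if err != nil {
		panic(err)
	}
	// Unset optional fields are omitted, so both values serialize to
	// {"name":"dshm","medium":"Memory"}.
	fmt.Println(same) // prints "true"
}
```

Under these assumptions, comparing stored serialized data instead of a hash of the Go string rendering stays stable across dependency upgrades that only add unset optional fields.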

ahg-g closed this as completed on Dec 16, 2024