
ServiceLB not detecting removed loadbalancers #1870

Closed
ShadowJonathan opened this issue Jun 5, 2020 · 7 comments
Labels: kind/bug Something isn't working

@ShadowJonathan

(Re-issuing of k3s-io/klipper-lb#3, because I've possibly posted that in the wrong spot after seeing what ServiceLB does)

Version:
k3s version v1.17.5+k3s1 (58ebdb2a)

K3s arguments:
k3os installation, default arguments, no tampering

k3os version v0.10.1

Describe the bug

Editing a Service resource so that its type is no longer LoadBalancer, or removing the Service entirely, is not detected, and the daemonset pods + iptables rules will not be removed.

When a ServiceLB daemonset gets removed, the corresponding iptables rules aren't flushed/removed.

To Reproduce

1. Install a clean k3os instance.
2. Add a Pod resource that opens a port.
3. Add a Service resource of type LoadBalancer.
4. Confirm that the svclb daemonset pod gets created.
5. Remove the pod and the service.
6. Observe the stale daemonset and iptables rules.
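
A minimal sketch of these steps with kubectl, assuming a hypothetical pod named echo serving on port 8080 (the name is illustrative; the image is the same sample image used later in this thread):

$ kubectl run echo --image=gcr.io/google-samples/node-hello:1.0 --port=8080 --labels=app=echo
$ kubectl expose pod echo --type=LoadBalancer --port=8080
$ kubectl get daemonset                                # svclb-echo should appear
$ kubectl delete svc echo && kubectl delete pod echo
$ kubectl get daemonset                                # expected: svclb-echo gone; observed: it remains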

Expected behavior

For the daemonset to be removed and the iptables to be cleared of corresponding rules.

Actual behavior

The daemonset stays up despite the backing container being gone, and the iptables rules are stale.
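
For reference, both halves of the stale state can be checked on the node once the service is gone (the daemonset name and port are taken from the helloworld example later in this thread):

$ kubectl get daemonset svclb-helloworld    # stale svclb daemonset
$ sudo iptables-save -t nat | grep 8080     # stale port/DNAT rules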

@brandond
Member

brandond commented Jun 5, 2020

Are you sure your change is valid? There are a number of gotchas when changing service types, as some fields are only allowed to be set in combination with particular service types, and an invalid combination will cause the update to be rejected. Can you confirm that the cluster has actually accepted the service spec update?
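
For example, the accepted type can be read back from the API server (the service name here is illustrative):

$ kubectl get svc helloworld -o jsonpath='{.spec.type}'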

@ShadowJonathan
Author

In this particular case I remember changing LoadBalancer to ClusterIP, and then having to remove the nodePort (yes, it was in there) to allow the change.

I don't know whether that particular situation triggered this bug, but it was reproduced by using the k8s-land/gitea chart with a nodePort set on the service.gitea path.

@rancher-max
Contributor

  • With the exact steps of "removing the pod and service" as mentioned, the daemonset is deleted for me in my tests.
  • However, it appears the service was actually edited to ClusterIP instead of LoadBalancer? In that case I can confirm the daemonset erroneously remains.

I was deploying the following yaml for simple testing, and then editing the service type to ClusterIP and removing the nodePort value from the spec directly with kubectl edit (a non-interactive equivalent is sketched after the list below):

kind: Service
apiVersion: v1
metadata:
  name: helloworld
spec:
  type: LoadBalancer
  selector:
    app: helloworld
  ports:
    - name: http
      protocol: TCP
      port: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
        - name: helloworld
          image: gcr.io/google-samples/node-hello:1.0
          ports:
            - containerPort: 8080
              protocol: TCP
  • If deploying ONLY a service with type LoadBalancer, it also creates the daemonset. I believe this is expected behavior, even though it is a somewhat contrived scenario.
  • If deploying the above yaml with type set to ClusterIP originally, then a daemonset is never deployed with it.
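
For reference, the same edit can be applied non-interactively; this is a sketch assuming the helloworld service above with a nodePort present on its first port:

$ kubectl patch svc helloworld --type=json \
    -p='[{"op": "replace", "path": "/spec/type", "value": "ClusterIP"}, {"op": "remove", "path": "/spec/ports/0/nodePort"}]'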

@brandond
Member

brandond commented Aug 27, 2020

Ah OK - so we're not picking up on the change from LoadBalancer to ClusterIP. The daemonset should probably be torn down immediately when that change occurs. It's not cleaned up at delete time because by then the service is no longer of type LoadBalancer, so we don't expect it to have a daemonset.
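
Until that teardown happens automatically, the stale daemonset can be removed by hand; assuming the helloworld example above, the name follows the svclb-<service> pattern:

$ kubectl delete daemonset svclb-helloworld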


@PrivatePuffin

PrivatePuffin commented Nov 29, 2021

@k3s-io and @bradtopol This is really not acceptable behavior for a production product and is STILL an issue after more than 1.5 years.

@rancher-max
Contributor

Validated using v1.23.5-rc1+k3s1

Performed the same steps as mentioned above, and this is now working:

$ k get all -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE              NOMINATED NODE   READINESS GATES
pod/helloworld-5f5b77ccdb-gg8gn   1/1     Running   0          62s   10.42.0.11   ip-172-31-43-96   <none>           <none>
pod/svclb-helloworld-nt2gk        1/1     Running   0          62s   10.42.0.10   ip-172-31-43-96   <none>           <none>

NAME                 TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)          AGE   SELECTOR
service/helloworld   LoadBalancer   10.43.92.62   172.31.43.96   8080:30224/TCP   62s   app=helloworld
service/kubernetes   ClusterIP      10.43.0.1     <none>         443/TCP          12m   <none>

NAME                              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE   CONTAINERS    IMAGES                      SELECTOR
daemonset.apps/svclb-helloworld   1         1         1       1            1           <none>          62s   lb-tcp-8080   rancher/klipper-lb:v0.3.4   app=svclb-helloworld

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                 SELECTOR
deployment.apps/helloworld   1/1     1            1           62s   helloworld   gcr.io/google-samples/node-hello:1.0   app=helloworld

NAME                                    DESIRED   CURRENT   READY   AGE   CONTAINERS   IMAGES                                 SELECTOR
replicaset.apps/helloworld-5f5b77ccdb   1         1         1       62s   helloworld   gcr.io/google-samples/node-hello:1.0   app=helloworld,pod-template-hash=5f5b77ccdb

$ k edit svc helloworld
service/helloworld edited

$ k get all -o wide
NAME                              READY   STATUS    RESTARTS   AGE     IP           NODE              NOMINATED NODE   READINESS GATES
pod/helloworld-5f5b77ccdb-gg8gn   1/1     Running   0          2m34s   10.42.0.11   ip-172-31-43-96   <none>           <none>

NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE     SELECTOR
service/helloworld   ClusterIP   10.43.92.62   <none>        8080/TCP   2m34s   app=helloworld
service/kubernetes   ClusterIP   10.43.0.1     <none>        443/TCP    13m     <none>

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE     CONTAINERS   IMAGES                                 SELECTOR
deployment.apps/helloworld   1/1     1            1           2m34s   helloworld   gcr.io/google-samples/node-hello:1.0   app=helloworld

NAME                                    DESIRED   CURRENT   READY   AGE     CONTAINERS   IMAGES                                 SELECTOR
replicaset.apps/helloworld-5f5b77ccdb   1         1         1       2m34s   helloworld   gcr.io/google-samples/node-hello:1.0   app=helloworld,pod-template-hash=5f5b77ccdb
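
As a final check on the node, the earlier rule inspection can be repeated; with the fix in place no stale entries for the service port are expected:

$ sudo iptables-save -t nat | grep 8080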
