
ServiceLB not detecting removed loadbalancers #1870

Closed
ShadowJonathan opened this issue Jun 5, 2020 · 7 comments
Labels: kind/bug Something isn't working

@ShadowJonathan

(Re-issuing of k3s-io/klipper-lb#3, because I've possibly posted that in the wrong spot after seeing what ServiceLB does)

Version:
k3s version v1.17.5+k3s1 (58ebdb2a)

K3s arguments:
k3os installation, default arguments, no tampering

k3os version v0.10.1

Describe the bug

Editing a Service resource so that its type is no longer LoadBalancer, or removing the Service entirely, is not detected, and the daemonset pods + iptables rules will not be removed.

When a ServiceLB daemonset gets removed, the corresponding iptables rules aren't flushed/removed.

To Reproduce

1. Install a clean k3os instance.
2. Add a Pod resource that opens a port.
3. Add a Service resource of type LoadBalancer.
4. Confirm that the svclb daemonset pod gets created.
5. Remove the pod and the service.
6. Observe the stale daemonset and iptables rules.
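
A minimal sketch of these steps with kubectl, assuming a hypothetical pod named echo serving on port 8080 (the name is illustrative; the image is the same sample image used later in this thread):

$ kubectl run echo --image=gcr.io/google-samples/node-hello:1.0 --port=8080 --labels=app=echo
$ kubectl expose pod echo --type=LoadBalancer --port=8080
$ kubectl get daemonset                                # svclb-echo should appear
$ kubectl delete svc echo && kubectl delete pod echo
$ kubectl get daemonset                                # expected: svclb-echo gone; observed: it remains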

Expected behavior

For the daemonset to be removed and the iptables to be cleared of corresponding rules.

Actual behavior

The daemonset stays up despite the backing container being gone, and the iptables rules are stale.
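
For reference, both halves of the stale state can be checked on the node once the service is gone (the daemonset name and port are taken from the helloworld example later in this thread):

$ kubectl get daemonset svclb-helloworld    # stale svclb daemonset
$ sudo iptables-save -t nat | grep 8080     # stale port/DNAT rules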

@brandond
Member

brandond commented Jun 5, 2020

Are you sure your change is valid? There are a number of gotchas when changing service types, as some fields are only allowed to be set in combination with particular service types, and an invalid combination will cause the update to be rejected. Can you confirm that the cluster has actually accepted the service spec update?
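
For example, the accepted type can be read back from the API server (the service name here is illustrative):

$ kubectl get svc helloworld -o jsonpath='{.spec.type}'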

@ShadowJonathan
Author

In this particular case I remember changing LoadBalancer to ClusterIP, and then having to remove the nodePort (yes, it was in there) to allow the change.

I don't know whether that particular situation triggered this bug, but it was reproduced by using the k8s-land/gitea chart with a nodePort set on the service.gitea path.

@rancher-max
Contributor

  • With the exact steps of "removing the pod and service" as mentioned, the daemonset is deleted for me in my tests.
  • However, it appears the service was actually edited to ClusterIP instead of LoadBalancer? In that case I can confirm the daemonset erroneously remains.

I was deploying the following yaml for simple testing, and then editing the service type to ClusterIP and removing the nodePort value from the spec directly with kubectl edit (a non-interactive equivalent is sketched after the list below):

kind: Service
apiVersion: v1
metadata:
  name: helloworld
spec:
  type: LoadBalancer
  selector:
    app: helloworld
  ports:
    - name: http
      protocol: TCP
      port: 8080
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: helloworld
spec:
  replicas: 1
  selector:
    matchLabels:
      app: helloworld
  template:
    metadata:
      labels:
        app: helloworld
    spec:
      containers:
        - name: helloworld
          image: gcr.io/google-samples/node-hello:1.0
          ports:
            - containerPort: 8080
              protocol: TCP
  • If deploying ONLY a service with type LoadBalancer, it also creates the daemonset. I believe this is expected behavior, even though it is a somewhat contrived scenario.
  • If deploying the above yaml with type set to ClusterIP originally, then a daemonset is never deployed with it.
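
For reference, the same edit can be applied non-interactively; this is a sketch assuming the helloworld service above with a nodePort present on its first port:

$ kubectl patch svc helloworld --type=json \
    -p='[{"op": "replace", "path": "/spec/type", "value": "ClusterIP"}, {"op": "remove", "path": "/spec/ports/0/nodePort"}]'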

@brandond
Member

brandond commented Aug 27, 2020

Ah OK - so we're not picking up on the change from LoadBalancer to ClusterIP. The daemonset should probably be torn down immediately when that change occurs. It's not cleaned up at delete time because by then the service is no longer of type LoadBalancer, so we don't expect it to have a daemonset.
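
Until that teardown happens automatically, the stale daemonset can be removed by hand; assuming the helloworld example above, the name follows the svclb-<service> pattern:

$ kubectl delete daemonset svclb-helloworld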


@PrivatePuffin

PrivatePuffin commented Nov 29, 2021

@k3s-io and @bradtopol This is really not acceptable behavior for a production product and is STILL an issue after more than 1.5 years.

@rancher-max
Contributor

Validated using v1.23.5-rc1+k3s1

Performed the same steps as mentioned above, and this is now working:

$ k get all -o wide
NAME                              READY   STATUS    RESTARTS   AGE   IP           NODE              NOMINATED NODE   READINESS GATES
pod/helloworld-5f5b77ccdb-gg8gn   1/1     Running   0          62s   10.42.0.11   ip-172-31-43-96   <none>           <none>
pod/svclb-helloworld-nt2gk        1/1     Running   0          62s   10.42.0.10   ip-172-31-43-96   <none>           <none>

NAME                 TYPE           CLUSTER-IP    EXTERNAL-IP    PORT(S)          AGE   SELECTOR
service/helloworld   LoadBalancer   10.43.92.62   172.31.43.96   8080:30224/TCP   62s   app=helloworld
service/kubernetes   ClusterIP      10.43.0.1     <none>         443/TCP          12m   <none>

NAME                              DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE   CONTAINERS    IMAGES                      SELECTOR
daemonset.apps/svclb-helloworld   1         1         1       1            1           <none>          62s   lb-tcp-8080   rancher/klipper-lb:v0.3.4   app=svclb-helloworld

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE   CONTAINERS   IMAGES                                 SELECTOR
deployment.apps/helloworld   1/1     1            1           62s   helloworld   gcr.io/google-samples/node-hello:1.0   app=helloworld

NAME                                    DESIRED   CURRENT   READY   AGE   CONTAINERS   IMAGES                                 SELECTOR
replicaset.apps/helloworld-5f5b77ccdb   1         1         1       62s   helloworld   gcr.io/google-samples/node-hello:1.0   app=helloworld,pod-template-hash=5f5b77ccdb

$ k edit svc helloworld
service/helloworld edited

$ k get all -o wide
NAME                              READY   STATUS    RESTARTS   AGE     IP           NODE              NOMINATED NODE   READINESS GATES
pod/helloworld-5f5b77ccdb-gg8gn   1/1     Running   0          2m34s   10.42.0.11   ip-172-31-43-96   <none>           <none>

NAME                 TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)    AGE     SELECTOR
service/helloworld   ClusterIP   10.43.92.62   <none>        8080/TCP   2m34s   app=helloworld
service/kubernetes   ClusterIP   10.43.0.1     <none>        443/TCP    13m     <none>

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE     CONTAINERS   IMAGES                                 SELECTOR
deployment.apps/helloworld   1/1     1            1           2m34s   helloworld   gcr.io/google-samples/node-hello:1.0   app=helloworld

NAME                                    DESIRED   CURRENT   READY   AGE     CONTAINERS   IMAGES                                 SELECTOR
replicaset.apps/helloworld-5f5b77ccdb   1         1         1       2m34s   helloworld   gcr.io/google-samples/node-hello:1.0   app=helloworld,pod-template-hash=5f5b77ccdb
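
As a final check on the node, the earlier rule inspection can be repeated; with the fix in place no stale entries for the service port are expected:

$ sudo iptables-save -t nat | grep 8080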
