
install script breaks networking when using cilium with systemd >= 249 #7736

Closed
legalgig opened this issue Jun 11, 2023 · 29 comments

Comments

@legalgig

Environmental Info:
K3s Version: v1.27.1+k3s1

Node(s) CPU architecture, OS, and Version: Ubuntu 22.04.2 LTS

Linux node 5.15.0-73-generic #80-Ubuntu SMP Mon May 15 15:18:26 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:

3 control planes
extra args "--flannel-backend=none --disable-network-policy --disable servicelb --disable traefik --disable local-storage"
CNI: cilium v1.13.3 with native routing (https://docs.cilium.io/en/v1.13/network/concepts/routing/#native-routing)

Describe the bug:

When the flannel CNI is disabled, the "service_enable_and_start()" function still tries to save and restore iptables rules, which may break node networking for some CNI providers. Please add an argument to disable this behavior.

Steps To Reproduce:

Expected behavior:

Actual behavior:

Additional context / logs:

Workaround:
set INSTALL_K3S_SKIP_START to true and start the k3s service manually
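
For example, a minimal sketch using the documented install script variables and this cluster's flags (adjust the flags to your setup):

    curl -sfL https://get.k3s.io | INSTALL_K3S_SKIP_START=true \
      INSTALL_K3S_EXEC="--flannel-backend=none --disable-network-policy --disable servicelb --disable traefik --disable local-storage" sh -
    # deploy/verify the CNI, then start the service manually
    sudo systemctl start k3s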

@brandond
Member

Can you describe how the save/restore is breaking cilium on your node?

@legalgig
Author

I'm not sure if it breaks cilium itself but the node:

  • becomes unreachable through SSH (with the error kex_exchange_identification: Connection closed by remote host)
  • DNS stops resolving
  • pinging from the node surprisingly works (tested with local and external addresses)
  • restarting the k3s service doesn't bring the node back, and it throws a "context deadline exceeded" error while trying to connect to anything

@brandond
Member

brandond commented Jun 12, 2023

We call this out in the docs, see the "Cilium" tab of this section:
https://docs.k3s.io/installation/network-options#custom-cni

Before running k3s-killall.sh or k3s-uninstall.sh, you must manually remove cilium_host, cilium_net and cilium_vxlan interfaces. If you fail to do this, you may lose network connectivity to the host when K3s is stopped
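
For reference, the interface cleanup described in those docs boils down to running something like this before k3s-killall.sh or k3s-uninstall.sh:

    ip link delete cilium_host
    ip link delete cilium_net
    ip link delete cilium_vxlan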

Are you saying that this also happens when you re-run the install script when using Cilium?

github-project-automation bot moved this from New to Done Issue in K3s Development Jun 12, 2023
brandond reopened this Jun 12, 2023
@legalgig
Author

I've tested it with those commands and the node networking seems to work even after executing the k3s installer, but there is a small problem with this approach. After uncordoning the node, cilium stops routing traffic on it because the interfaces and firewall rules are now missing. That's easily fixable by restarting the cilium pod on that node, but with my workaround this step is not required. Now to my question: is it really necessary to remove all KUBE-* and flannel-* firewall entries after every restart of the k3s service?

@brandond
Member

@rbrtbnfgl @manuelbuil would you mind taking a look at this?

@brandond
Member

brandond commented Jun 13, 2023

This was added to ensure that stale rules from previous configurations of kubelet, kube-proxy, kube-router, and flannel are cleaned up properly. I'm confused as to why removing these rules would affect the operation of cilium?

@rbrtbnfgl
Contributor

Are you using this: https://docs.cilium.io/en/v1.13/network/kube-router/#kube-router ? This could be the issue: we delete the kube-router rules to remove the network policy configuration done inside K3s, but following that guide, kube-router is used for internal routing.

@legalgig
Author

If I understand correctly, --disable-network-policy disables the kube-router built into k3s, and I don't have any standalone deployment of it. This is my helm values file for cilium:

cilium:
  tunnel: "disabled"
  ipv4NativeRoutingCIDR: 10.42.0.0/16
  autoDirectNodeRoutes: true
  ipam: 
    mode: kubernetes

  hubble:
    ui:
      enabled: true
      ingress:
        enabled: true
        annotations:
          traefik.ingress.kubernetes.io/router.tls: "true"
          traefik.ingress.kubernetes.io/router.entrypoints: websecure
        hosts:
        - dummyhost
        tls:
        - secretName: dummycert
    relay:
      enabled: true

  bgp: # migrate it to new Cilium BGP Control Plane
    enabled: true
    announce:
      loadbalancerIP: true
      podCIDR: true

@rbrtbnfgl
Contributor

rbrtbnfgl commented Jun 14, 2023

I verified with a new setup and the iptables commands are not affecting the cilium rules.
How many network interfaces does the node have?
How did you run the first instance of K3s? Did you use the config file for the flags or did you pass them through the script? When you did the update, did you pass the same flags to the command?
Another check you could do is to compare the rules with iptables-save before and after the update.

@legalgig
Author

  1. My nodes have 2 NICs (10G and 1G) but only one is connected and enabled. Here is the netplan config (the static IP is set through the DHCP server):
network:
  ethernets:
    enp1s0:
      dhcp4: true
  version: 2
  2. I'm using ansible (my own playbook) for the initial deployment of the nodes and I do not update the nodes with it (I'm using system-upgrade-controller for that). But the flags are always the same.
first control plane:
curl -sfL https://get.k3s.io | INSTALL_K3S_VERSION="v1.27.1+k3s1" INSTALL_K3S_EXEC="--flannel-backend=none --disable-network-policy --disable servicelb --disable traefik --disable local-storage --tls-san <keepalived ip> --cluster-init" sh -

two others
curl -sfL https://get.k3s.io | K3S_TOKEN="<token from first control plane>" INSTALL_K3S_VERSION="v1.27.1+k3s1"  INSTALL_K3S_EXEC='--flannel-backend=none --disable-network-policy --disable servicelb --disable traefik --disable local-storage --server https://<ip of the first control plane>:6443' sh -
  3. I've looked into the iptables-save/restore and it seems to break the whole networking even if nothing is changed, i.e.
this command returns nothing:
iptables-save | grep -i flannel

check the count of all rules
iptables-save | wc -l
1307

so I basically apply everything as it was
iptables-save | grep -vi flannel | iptables-restore

check the count after restore
iptables-save | wc -l
1307

but now the node is gone (I executed those commands through kvm)

I've also tried to execute something like this

iptables-save | grep -v KUBE- | grep -vi cilium | iptables-restore

and it works, but now I have to restart the cilium pod so it can restore its firewall configuration

@rbrtbnfgl
Contributor

But the iptables rules should be the same. Could you check the iptables version? Could you try to run K3s with the flag --prefer-bundled-bin?

@legalgig
Author

$ iptables --version
iptables v1.8.7 (nf_tables)

Sadly adding --prefer-bundled-bin to INSTALL_K3S_EXEC didn't help

@rbrtbnfgl
Contributor

rbrtbnfgl commented Jun 16, 2023

That's strange.
Following what you did: with K3s plus cilium you effectively run iptables-save | iptables-restore (since no rules match flannel), you lose the ssh connection, and after removing all the cilium rules the ssh connection comes back but you have to restart cilium. I don't know if it could be related, but is firewalld enabled?

@legalgig
Author

legalgig commented Jun 16, 2023

Yeah, iptables-save | iptables-restore would be exactly the same thing, I didn't think about it. A small correction about removing the cilium rules: the node stays reachable only when I remove the cilium rules before touching iptables; when I do anything to iptables before removing the cilium rules, the node becomes unreachable and the only fix is to reboot it.

I've also stumbled across a bug in cilium which describes my issue. There is a snapshot release which is supposed to fix it; I'll try to test it today.

cilium/cilium#18706

@rbrtbnfgl
Contributor

rbrtbnfgl commented Jun 16, 2023

Reading a comment across the various issues, it seems like a bug related to cilium and Ubuntu 22.04. I am using Ubuntu 20.04, which is why I wasn't able to reproduce it.
Azure/AKS#3531 (comment)
I don't know if it could be your case.

@legalgig
Author

I've updated cilium to 1.14.0-snapshot.3 and now it works! Thanks for your support!

brandond changed the title from "https://get.k3s.io breaks iptables configuration" to "install script breaks networking when using cilium with systemd >= 249" Jun 16, 2023
@GyurkanM

I upgraded cilium to 1.16.1 without kube-proxy, and "iptables-save | grep -v KUBE | iptables-restore" caused a network interruption. I found that the mangle table in particular breaks the network; the solution is to flush it and do a rollout restart of cilium so the rules are added again. But if you do not have access to the machine's console you can't do it, because SSH is broken.
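
A sketch of that recovery, assuming console access and that cilium runs as the usual DaemonSet in kube-system (the resource name is an assumption, adjust to your deployment):

    iptables -t mangle -F                                     # flush the mangle table that broke host networking
    kubectl -n kube-system rollout restart daemonset/cilium   # cilium re-creates its rules on restart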

@adberger

@brandond I can confirm this is also needed on upgrades, not only when running k3s-killall.sh or k3s-uninstall.sh.
It's not needed if iptables is not installed.

Cilium Version: v1.14.10-cee.1
OS: Debian GNU/Linux 11 (bullseye)
systemd: systemd 247 (247.3-7+deb11u5)
iptables: iptables v1.8.7 (nf_tables)

Shall we update the documentation?

@brandond
Member

Why exactly is it needed on upgrades?

@adberger

I can't say for sure, but we had the exact same problem as described in this issue when not executing:

iptables-save | grep -iv cilium | iptables-restore
ip6tables-save | grep -iv cilium | ip6tables-restore

My guess is that it's coming from these lines in the installer script:

    for XTABLES in iptables ip6tables; do
        if has_working_xtables ${XTABLES}; then
            $SUDO ${XTABLES}-save 2>/dev/null | grep -v KUBE- | grep -iv flannel | $SUDO ${XTABLES}-restore
        fi
    done

    [ "${HAS_SYSTEMD}" = true ] && systemd_start

We have updated our documentation internally.
So if this really doesn't happen for other people we can leave it as it is :)

@brandond
Member

brandond commented Oct 11, 2024

We attempt to detect invocation of iptables-save that will not properly round-trip the rules here:

k3s/install.sh

Lines 1092 to 1096 in 430a7dc

has_working_xtables() {
    if $SUDO sh -c "command -v \"$1-save\"" 1> /dev/null && $SUDO sh -c "command -v \"$1-restore\"" 1> /dev/null; then
        if $SUDO $1-save 2>/dev/null | grep -q '^-A CNI-HOSTPORT-MASQ -j MASQUERADE$'; then
            warn "Host $1-save/$1-restore tools are incompatible with existing rules"
        else

Can you identify what specifically is being missed so that we can handle it?

Note that we don't test K3s with cilium (or any CNI other than flannel), so it's probably not something that is encountered very often.

@adberger

Our iptables rules look like this:
root@host1:/root# iptables -S

# Warning: iptables-legacy tables present, use iptables-legacy to see them
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N CILIUM_FORWARD
-N CILIUM_INPUT
-N CILIUM_OUTPUT
-N KUBE-FIREWALL
-N KUBE-KUBELET-CANARY
-A INPUT -m comment --comment "cilium-feeder: CILIUM_INPUT" -j CILIUM_INPUT
-A INPUT -j KUBE-FIREWALL
-A FORWARD -m comment --comment "cilium-feeder: CILIUM_FORWARD" -j CILIUM_FORWARD
-A OUTPUT -m comment --comment "cilium-feeder: CILIUM_OUTPUT" -j CILIUM_OUTPUT
-A OUTPUT -j KUBE-FIREWALL
-A CILIUM_FORWARD -o cilium_host -m comment --comment "cilium: any->cluster on cilium_host forward accept" -j ACCEPT
-A CILIUM_FORWARD -i cilium_host -m comment --comment "cilium: cluster->any on cilium_host forward accept (nodeport)" -j ACCEPT
-A CILIUM_FORWARD -i lxc+ -m comment --comment "cilium: cluster->any on lxc+ forward accept" -j ACCEPT
-A CILIUM_FORWARD -i cilium_net -m comment --comment "cilium: cluster->any on cilium_net forward accept (nodeport)" -j ACCEPT
-A CILIUM_INPUT -m comment --comment "cilium: ACCEPT for proxy traffic" -j ACCEPT
-A CILIUM_OUTPUT -m comment --comment "cilium: ACCEPT for proxy traffic" -j ACCEPT
-A CILIUM_OUTPUT -m comment --comment "cilium: ACCEPT for l7 proxy upstream traffic" -j ACCEPT
-A CILIUM_OUTPUT -m comment --comment "cilium: host->any mark as from host" -j MARK --set-xmark 0xc00/0xf00
-A KUBE-FIREWALL ! -s 127.0.0.0/8 -d 127.0.0.0/8 -m comment --comment "block incoming localnet connections" -m conntrack ! --ctstate RELATED,ESTABLISHED,DNAT -j DROP

root@host1:/root# iptables-legacy -S

-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT

Does this help in any way?

@danielvincenzi

Hey @brandond @adberger! Have you found out anything about this issue? I had the same behavior today: after updating a K3s configuration to disable the default storage and restarting the K3s service, I lost access to the node.

OS/kernel: 5.10.0-33-amd64 #1 SMP Debian 5.10.226-1 (2024-10-03) x86_64
K3s version: v1.30.6+k3s1
Cilium: 1.16.3

Do you know how I can solve this? I'm unsure about the stability of the cluster.

Thank you so much!

@brandond
Member

brandond commented Nov 5, 2024

Nope. If someone wants to figure out what's being dropped, as described at #7736 (comment), we could try to modify the script to detect it.

Also, are you sure this is the same issue? Did you for some reason re-run the install script just to change the configuration?

@danielvincenzi

danielvincenzi commented Nov 5, 2024

Thanks @brandond for the quick response! I may have made a mistake. I ran this command to install the cluster:

curl -sfL https://get.k3s.io | sh -s - server \
 --flannel-backend=none \
 --disable-network-policy \
 --disable-kube-proxy \
 --disable traefik \
 --disable servicelb \
 --cluster-init

And the one below was to update it:

curl -sfL https://get.k3s.io | sh -s - server \
 --flannel-backend=none \
 --disable-network-policy \
 --disable-kube-proxy \
 --disable traefik \
 --disable servicelb \
 --disable local-storage \
 --cluster-init

I just added --disable local-storage \. Could that have been the problem?

@brandond
Member

brandond commented Nov 5, 2024

No, that'd re-run the install script. You don't really need to do that just to change the config though. You can just systemctl edit k3s and modify the flags before restarting the service.
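
For example, one way to do that without re-running the installer (a sketch; the flags can also be kept in /etc/rancher/k3s/config.yaml):

    sudo systemctl edit --full k3s     # adjust the flags on the ExecStart= line, e.g. add --disable local-storage
    sudo systemctl restart k3s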

@danielvincenzi

Great, I'll test it, thanks a lot!

@adberger

adberger commented Nov 5, 2024

Our solution was to remove the iptables binary from the host, since cilium manages the iptables via the Pod and doesn't need the binary on the host.

When iptables is not found while running the k3s install script, networking will not break.
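
On a Debian host like the one above, that amounts to something like the following (a sketch; check first that nothing else on the host needs the package):

    sudo apt-get remove iptables    # cilium's pod ships its own iptables, so the host binary isn't required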

@danielvincenzi

Thanks, @adberger! I will try it.
