flannel not respecting (--flannel-external-ip or --flannel-iface) when interface is not on default route #11476

Closed
cepelinas9000 opened this issue Dec 18, 2024 · 1 comment

Comments

@cepelinas9000

Environmental Info:
K3s Version: k3s version v1.31.3+k3s1 (6e6af98)

Node(s) CPU architecture, OS, and Version:

There are two nodes with similar configuration

$ uname -a
Linux pc 6.10.1-vanilla #1 SMP PREEMPT_DYNAMIC Thu Jul 25 12:47:49 EEST 2024 x86_64 GNU/Linux

Cluster Configuration:

There are 2 nodes with the following configuration (only the most important bits shown):

node 1 - server node:
eth0 192.168.66.200 (with default route)
wg0 172.30.0.1 (vpn to node2)
node 2 - just agent:
eth2 x.x.x.x (redacted, default route)
wg0 172.30.0.2 (vpn to node1)
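
For anyone reproducing this layout, it may help to first confirm that the kernel itself routes traffic to the peer over wg0 with the VPN address as source, independent of k3s. The helper below is only a sketch: the name `route_src` is hypothetical, and it assumes the usual `<dst> dev <iface> src <addr> ...` shape of `ip route get` output.

```shell
# Print the source address the kernel would pick, reading `ip route get`
# output on stdin. Assumes the common "<dst> dev <iface> src <addr> ..."
# layout; this format is an assumption, not something shown in the issue.
route_src() {
    awk '{
        for (i = 1; i < NF; i++)
            if ($i == "src") { print $(i + 1); exit }
    }'
}
```

On node1 this could be run as `ip route get 172.30.0.2 | route_src`. If the WireGuard route is in place, the result should be the wg0 address rather than the eth0 one.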

Describe the bug:

The problem appears when starting node1 with the following parameters:

./k3s server -i 172.30.0.1 --node-external-ip 172.30.0.1 --flannel-backend=vxlan --flannel-external-ip 172.30.0.1 --flannel-iface wg0 --bind-address 172.30.0.1

and node2 with:

./k3s agent -t xxxtokenxxx --server https://172.30.0.1:6443 -i 172.30.0.2 --node-external-ip 172.30.0.2  --flannel-iface wg0 --bind-address 172.30.0.2 

With these parameters, the flannel tunnel from node1 uses the wrong source IP. For example, pinging node1 from node2 produces the following packets:

node1  # tcpdump -i wg0 -n udp port 8472
dropped privs to pcap
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on wg0, link-type RAW (Raw IP), snapshot length 262144 bytes
16:58:21.074052 IP 172.30.0.2.40978 > 172.30.0.1.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.0 > 10.42.0.1: ICMP echo request, id 44022, seq 3, length 64
16:58:21.074128 IP 192.168.66.200.55307 (!!! need to be 172.30.0.1!!! ) > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.1 > 10.42.1.0: ICMP echo reply, id 44022, seq 3, length 64
16:58:22.097767 IP 172.30.0.2.40978 > 172.30.0.1.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.1.0 > 10.42.0.1: ICMP echo request, id 44022, seq 4, length 64
16:58:22.097811 IP 192.168.66.200.55307 > 172.30.0.2.8472: OTV, flags [I] (0x08), overlay 0, instance 1
IP 10.42.0.1 > 10.42.1.0: ICMP echo reply, id 44022, seq 4, length 64

In the tcpdump output above, the source IP address of the packets sent by node1 is incorrect: it should be 172.30.0.1 but is 192.168.66.200.
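
To spot the mismatch without reading every line, the capture can be filtered for outer VXLAN packets sent to the peer from an unexpected address. This is a minimal sketch: the function name `vxlan_bad_sources` is hypothetical, and it assumes tcpdump's default one-line text format with `-n`, as in the capture above (without the inline annotation).

```shell
# Read tcpdump text output on stdin and print every outer source address
# (other than the expected one) seen on VXLAN packets sent to the peer.
#   $1 = address the outer packets should come from (e.g. 172.30.0.1)
#   $2 = remote VTEP address the packets are sent to (e.g. 172.30.0.2)
vxlan_bad_sources() {
    want_src="$1"
    peer="$2"
    awk -v want="$want_src" -v dst="$peer.8472:" '
        # Outer packet lines look like: <time> IP <src>.<port> > <dst>.<port>: ...
        $2 == "IP" && $4 == ">" && $5 == dst {
            src = $3
            sub(/\.[0-9]+$/, "", src)   # strip the trailing ".port"
            if (src != want) print src
        }
    ' | sort -u
}
```

For example, on node1: `tcpdump -i wg0 -n -c 20 udp port 8472 | vxlan_bad_sources 172.30.0.1 172.30.0.2`. Empty output would mean every outgoing VXLAN packet in the sample used the expected source address.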

After some digging, it is evident that node1 is using the wrong interface (eth0 instead of wg0) for vxlan:

node1:
node1 /sys/devices/virtual/net/flannel.1 # ls -l
total 0
### chopped ls -l ###
-rw-r--r-- 1 root root 4096 Dec 18 17:01 flags
-rw-r--r-- 1 root root 4096 Dec 18 17:01 gro_flush_timeout
-rw-r--r-- 1 root root 4096 Dec 18 17:01 ifalias
-r--r--r-- 1 root root 4096 Dec 18 17:01 ifindex
-r--r--r-- 1 root root 4096 Dec 18 16:56 iflink
-r--r--r-- 1 root root 4096 Dec 18 17:01 link_mode
lrwxrwxrwx 1 root root    0 Dec 18 17:01 lower_eth0 -> ../../../pci0000:00/0000:00:01.3/0000:09:00.2/0000:0a:03.0/0000:0e:00.0/net/eth0
-rw-r--r-- 1 root root 4096 Dec 18 17:01 mtu
-r--r--r-- 1 root root 4096 Dec 18 16:56 name_assign_type
### chopped ls -l ###

By comparison, the flannel.1 directory on node2 shows the correct interface:

node2 /sys/devices/virtual/net/flannel.1 # ls -l
total 0
### chopped ls -l ###
-rw-r--r-- 1 root root 4096 Dec 18 17:02 ifalias
-r--r--r-- 1 root root 4096 Dec 18 16:58 ifindex
-r--r--r-- 1 root root 4096 Dec 18 16:58 iflink
-r--r--r-- 1 root root 4096 Dec 18 17:02 link_mode
lrwxrwxrwx 1 root root    0 Dec 18 16:58 lower_wg0 -> ../wg0
-rw-r--r-- 1 root root 4096 Dec 18 17:02 mtu
### chopped ls -l ###
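
The same check can be done without scanning the full listing by printing only the `lower_*` links of the device. This is a rough sketch: the function name `lower_devs` is hypothetical, and the optional second argument (the sysfs directory to look in) exists only to make the helper easy to try against a test directory.

```shell
# Print the underlying ("lower") interfaces of a virtual network device,
# based on the lower_<iface> symlinks in its sysfs directory.
#   $1 = device name (e.g. flannel.1)
#   $2 = sysfs root, defaults to /sys/devices/virtual/net
lower_devs() {
    dir="${2:-/sys/devices/virtual/net}/$1"
    for link in "$dir"/lower_*; do
        # Skip the unexpanded glob when no lower_* entry exists.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name=${link##*/}
        printf '%s\n' "${name#lower_}"
    done
}
```

Given the listings above, `lower_devs flannel.1` would be expected to print eth0 on node1 and wg0 on node2. `ip -d link show flannel.1` should report the same underlying device together with the vxlan `local` address.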

Steps To Reproduce:

  • Have at least one system behind NAT, with both systems behind a restrictive firewall
  • Install and configure a tunnel between the nodes (for example WireGuard)
  • Install K3s (download the binary from the release page and chmod +x it)
  • Manually start k3s server and k3s agent with the parameters above, leaving everything else at defaults
  • Ping 10.42.0.1 or 10.42.0.2 from the opposite node

Expected behavior:

The vxlan tunnel works.

Actual behavior:

The vxlan tunnel packets are dropped because of the wrong source IP address.

Additional context / logs:

@brandond
Member

the flannel tunnel from node1 uses wrong src ip. For example doing ping from node2 to node1 the following packets are produced:

-> https://github.com/flannel-io/flannel
