Pods unable to communicate between each other in single-node cluster when node goes offline #10785

Closed
NGTOne opened this issue Aug 30, 2024 · 3 comments

NGTOne commented Aug 30, 2024

Environmental Info:
K3s Version:

k3s version 1.29.5+k3s1 (4e53a323)
go version go1.21.9

Node(s) CPU architecture, OS, and Version:

Linux nvidia-desktop 5.10.104-tegra #1 SMP PREEMPT Tue May 16 10:43:59 CEST 2023 aarch64 aarch64 aarch64 GNU/Linux

Cluster Configuration:
Single server node, running in a "semi-airgapped" configuration. The device has both 4G and WiFi capability, but it is deployed aboard a vehicle in remote locations, so frequent network outages and changes are expected.

Describe the bug:
When no Internet connection is present, network requests between Pods stop working, and the reason is unclear. For example, gRPC requests fail with a timeout instead of succeeding. Reconnecting the device to a network makes the requests succeed again, with no other apparent changes.

Steps To Reproduce:

  • Installed K3s:
ExecStart=/usr/local/bin/k3s \
    server \
	'--tls-san' \
	'10.43.0.1' \
	'--resolv-conf' \
	'/run/systemd/resolve/resolv.conf' \
	'--prefer-bundled-bin' \

Dummy network:

dummy0: flags=195<UP,BROADCAST,RUNNING,NOARP>  mtu 1500
        inet 192.168.255.254  netmask 255.255.255.254  broadcast 0.0.0.0
        inet6 fe80::86e:a8ff:fe8e:91ed  prefixlen 64  scopeid 0x20<link>
        ether 0a:6e:a8:8e:91:ed  txqueuelen 1000  (Ethernet)

ip route output:

default via 192.168.20.1 dev wlan0 proto dhcp metric 600 
default via 192.168.255.254 dev dummy0 metric 50000 
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1 
169.254.0.0/16 dev dummy0 scope link metric 1000 
192.168.20.0/24 dev wlan0 proto kernel scope link src 192.168.20.82 metric 600 
192.168.255.254/31 dev dummy0 proto kernel scope link src 192.168.255.254 
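
To make the intent of the dummy interface easier to follow, here is a minimal Python sketch of how the default route is chosen from the table above. It assumes the usual Linux behaviour that the default route with the lowest metric wins, that a route without an explicit metric has metric 0, and that routes belonging to a downed interface are removed from the table. `parse_routes` and `pick_default` are hypothetical helper names, not part of K3s or iproute2, and this is a model of route selection only, not a diagnosis of the bug.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Route:
    dest: str    # "default" or a prefix such as "10.42.0.0/24"
    dev: str     # outgoing interface name
    metric: int  # lower is preferred; 0 when not listed


def parse_routes(text: str) -> List[Route]:
    """Parse the subset of `ip route` output shown in this issue."""
    routes = []
    for line in text.strip().splitlines():
        tokens = line.split()
        if not tokens:
            continue
        dev = tokens[tokens.index("dev") + 1]
        metric = int(tokens[tokens.index("metric") + 1]) if "metric" in tokens else 0
        routes.append(Route(dest=tokens[0], dev=dev, metric=metric))
    return routes


def pick_default(routes: List[Route], down: Optional[Set[str]] = None) -> Optional[Route]:
    """Return the lowest-metric default route whose interface is not down."""
    down = down or set()
    candidates = [r for r in routes if r.dest == "default" and r.dev not in down]
    return min(candidates, key=lambda r: r.metric, default=None)


# Route table copied from the issue.
ROUTES = """
default via 192.168.20.1 dev wlan0 proto dhcp metric 600
default via 192.168.255.254 dev dummy0 metric 50000
10.42.0.0/24 dev cni0 proto kernel scope link src 10.42.0.1
169.254.0.0/16 dev dummy0 scope link metric 1000
192.168.20.0/24 dev wlan0 proto kernel scope link src 192.168.20.82 metric 600
192.168.255.254/31 dev dummy0 proto kernel scope link src 192.168.255.254
"""

routes = parse_routes(ROUTES)
online = pick_default(routes)
offline = pick_default(routes, down={"wlan0"})
print("online default: ", online.dev, online.metric)
print("offline default:", offline.dev, offline.metric)
```

Under these assumptions, the wlan0 route is preferred while WiFi is up, and the dummy0 route is the only default left once wlan0 disappears, so the node should still have a default route when offline.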

Expected behavior:
Pods remain able to communicate when device goes offline.

Actual behavior:
Pods are unable to communicate when device goes offline.

Additional context / logs:
Not sure what to provide.
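
One piece of context that might help is whether the failure is a silent timeout, a refused connection, or a name-resolution error. The following is a minimal sketch of a TCP probe that could be run from inside a Pod, assuming Python is available in the container image. The `probe` function is a hypothetical helper, and the service name and port in the usage note are placeholders rather than values from this cluster.

```python
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except socket.timeout:
        # No answer within `timeout` seconds: packets are likely dropped.
        return "timeout"
    except ConnectionRefusedError:
        # The destination was reached but nothing is listening on the port.
        return "refused"
    except OSError as exc:
        # Includes DNS failures (socket.gaierror) and unreachable networks.
        return f"error: {exc}"
```

For example, `probe("my-service.default.svc.cluster.local", 50051)` (placeholder name and port) returning `timeout` while online runs return `ok` would point toward dropped traffic, whereas an `error:` result mentioning name resolution would point toward DNS.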

@dereknola
Member

You are advertising your server on '--tls-san' '10.43.0.1', but that address falls inside the default CIDR for K8s services (10.43.0.0/16). This is likely conflicting with your Pods' communication.
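
The overlap can be checked with Python's standard `ipaddress` module. This is a small sketch that assumes the K3s defaults of 10.42.0.0/16 for Pods and 10.43.0.0/16 for services, which apply when `--cluster-cidr` and `--service-cidr` are not set. `classify` is a hypothetical helper name.

```python
import ipaddress

# Assumption: K3s defaults, used when --cluster-cidr / --service-cidr are not set.
DEFAULT_CLUSTER_CIDR = ipaddress.ip_network("10.42.0.0/16")
DEFAULT_SERVICE_CIDR = ipaddress.ip_network("10.43.0.0/16")


def classify(addr: str) -> str:
    """Report which default range, if any, contains the address."""
    ip = ipaddress.ip_address(addr)
    if ip in DEFAULT_SERVICE_CIDR:
        return "service"
    if ip in DEFAULT_CLUSTER_CIDR:
        return "pod"
    return "other"


for addr in ("10.43.0.1", "10.42.0.1", "192.168.20.82"):
    print(addr, "->", classify(addr))
```

Here 10.43.0.1 lands in the service range, 10.42.0.1 (the cni0 address from the route table) in the Pod range, and the wlan0 address in neither.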

@NGTOne
Author

NGTOne commented Sep 2, 2024

I removed --tls-san from the startup arguments. No change. The Pods are still unable to communicate with each other when the device is offline.

@caroline-suse-rancher caroline-suse-rancher moved this from New to In Triage in K3s Development Oct 8, 2024
Contributor

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 45 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Oct 31, 2024
@github-project-automation github-project-automation bot moved this from In Triage to Done Issue in K3s Development Oct 31, 2024