k3s service does not start #11234
Comments
What are you using as the database for this cluster? You've provided very little in the way of error messages, but what you're showing here suggests that the node you restarted is unable to connect to the database. This has nothing to do with the prune command you ran prior to restarting. Check the node's connection to the database.
Thanks for your reply! Yes, there is indeed something wrong with my Postgres database: my requests are being rejected with a timeout. I have Postgres with Patroni installed, with HAProxy in front. My k3s cluster has 3 nodes, and Postgres with Patroni is installed on each node. Here is the configuration of my k3s server.
I can't really help you with that. You'll need to figure out why K3s can't connect to the database, wherever and however it is hosted. Have you checked to see if Postgres and Patroni are working properly following the restart?
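As a rough first check of the node's connection to the database, the sketch below tests whether a TCP connection to the datastore endpoint can be opened from the affected node. It is only an illustration: the endpoint URL, credentials, and port are hypothetical placeholders (the real value is whatever the K3s server was given as its `--datastore-endpoint`), and the helper names `endpoint_host_port` and `can_connect` are made up for this example.

```python
import socket
from urllib.parse import urlsplit


def endpoint_host_port(endpoint: str, default_port: int = 5432) -> tuple[str, int]:
    """Extract (host, port) from a postgres:// style URL."""
    parts = urlsplit(endpoint)
    if not parts.hostname:
        # Do not echo the endpoint here; it may contain a password.
        raise ValueError("endpoint has no host component")
    return parts.hostname, parts.port or default_port


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Hypothetical endpoint (assumption): replace with your own datastore URL.
    endpoint = "postgres://k3s:secret@127.0.0.1:5000/k3s"
    host, port = endpoint_host_port(endpoint)
    state = "reachable" if can_connect(host, port) else "NOT reachable"
    print(f"{host}:{port} is {state}")
```

A successful TCP connection only shows that something is listening on that address. With a proxy in front of Postgres, the proxy may accept the connection even when no healthy backend is available, so the state of Postgres and Patroni still needs to be checked separately.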
I rebooted only the problematic node and, as you know, k3s did not start after that. I am afraid to reboot the other two nodes: the k3s service is still running there, and after a reboot they may end up in the same state as the problematic node. Thank you for your help, I wish you good health! @brandond
Environmental Info:
K3s Version:
v1.28.6+k3s2
Node(s) CPU architecture, OS, and Version:
3 nodes with similar characteristics
Linux ds89290 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
3 nodes with the roles control-plane,master
Describe the bug:
After running the command
k3s crictl rmi --prune
the k3s service does not start.
Steps To Reproduce:
Run the command
k3s crictl rmi --prune
and restart the computer.
Additional context / logs:
On the problem node, the
systemctl status k3s
command shows
`k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
Active: activating (start) since Tue 2024-11-05 12:37:28 UTC; 25min ago
Docs: https://k3s.io
Process: 726 ExecStartPre=/bin/sh -xc ! /usr/bin/systemctl is-enabled --quiet nm-cloud-setup.service 2>/dev/null (code=exited, status=0/S>
Process: 752 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 768 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 777 (k3s-server)
Tasks: 12
Memory: 164.8M
CPU: 972ms
CGroup: /system.slice/k3s.service
└─777 "/usr/local/bin/k3s server" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "
Nov 05 12:43:36 ds89290 k3s[777]: time="2024-11-05T12:43:36Z" level=error msg="failed to ping connection: driver: bad connection"
Nov 05 12:45:38 ds89290 k3s[777]: time="2024-11-05T12:45:38Z" level=error msg="failed to ping connection: driver: bad connection"
Nov 05 12:47:39 ds89290 k3s[777]: time="2024-11-05T12:47:39Z" level=error msg="failed to ping connection: driver: bad connection"`
On the other two nodes I see
k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; disabled; vendor preset: enabled)
Active: active (running) since Mon 2024-11-04 12:47:15 UTC; 24h ago
Docs: https://k3s.io
Main PID: 3715692 (k3s-server)
Tasks: 514
Memory: 1.6G
CPU: 4min 51.908s
CGroup: /system.slice/k3s.service
├─ 3513 /var/lib/rancher/k3s/data/13f9723ffde84ba41d08658d407a523bcf32698f179c9ab30cc0534e1e5d2c1a/bin/containerd-shim-runc-v2>
├─ 3703 /var/lib/rancher/k3s/data/13f9723ffde84ba41d08658d407a523bcf32698f179c9ab30cc0534e1e5d2c1a/bin/containerd-shim-runc-v2>
├─ 3782 /var/lib/rancher/k3s/data/13f9723ffde84ba41d08658d407a523bcf32698f179c9ab30cc0534e1e5d2c1a/bin/containerd-shim-runc-v2>
├─ 3793 /var/lib/rancher/k3s/data/13f9723ffde84ba41d08658d407a523bcf32698f179c9ab30cc0534e1e5d2c1a/bin/containerd-shim-runc-v2>
├─ 3850 /var/lib/rancher/k3s/data/13f9723ffde84ba41d08658d407a523bcf32698f179c9ab30cc0534e1e5d2c1a/bin/containerd-shim-runc-v2>
├─ 3929 /var/lib/rancher/k3s/data/13f9723ffde84ba41d08658d407a523bcf32698f179c9ab30cc0534e1e5d2c1a/bin/containerd-shim-runc-v2>
...