High CPU and disk read/write, very large (2GB) state.db on k3s 1.22.9 #10306
As you noted, v1.22.9 is quite an old release. There have been a number of improvements to the compact logic in newer releases, so I suspect that things might be improved by a newer version. However, the messages you're seeing indicate that a combination of slow disk IO and high datastore write volume has filled the database to the point where compaction can no longer run in the allotted time, and old rows are not being purged. Follow the steps from this comment to figure out what's filling your database: #1575 (comment) - if you're able to share that data here, that would be great. You can then run this query to manually compact the old rows:

```sql
delete from kine where id in (
  select id from (
    select id, name from kine
    where id not in (select max(id) as id from kine group by name)
  ) as t
);
```

Note that you should do all of these things while k3s is not running.
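For reference, here is a minimal sketch of that whole procedure on a default single-node install, assuming the systemd unit is named `k3s`, the datastore is at the default path, and the `sqlite3` CLI is installed; the grouping query is just one illustrative way to see which keys hold the most revisions (the linked comment is the authoritative source for that step):

```sh
# Stop k3s first: the database must not be in use during manual compaction.
systemctl stop k3s

# Back up the datastore before modifying it.
cp /var/lib/rancher/k3s/server/db/state.db /var/lib/rancher/k3s/server/db/state.db.bak

# See which keys are accumulating the most revisions (illustrative query).
sqlite3 /var/lib/rancher/k3s/server/db/state.db \
  'select name, count(*) as revisions from kine group by name order by revisions desc limit 20;'

# Purge all but the newest revision of each key, then reclaim the freed pages.
sqlite3 /var/lib/rancher/k3s/server/db/state.db \
  'delete from kine where id in (select id from (select id, name from kine where id not in (select max(id) as id from kine group by name)) as t);'
sqlite3 /var/lib/rancher/k3s/server/db/state.db 'vacuum;'

systemctl start k3s
```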
Thanks for your help! I was able to do the manual compaction, and while it didn't reduce the file much (still 2.1 GB), the CPU load shown by [...]. As for the analysis output, here is what `sqlite3_analyze` yielded:

SQLite compact_rev_key stats:
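Assuming `sqlite3_analyze` here refers to SQLite's space-analysis utility (shipped as `sqlite3_analyzer`, e.g. in Ubuntu's `sqlite3-tools` package), it can be re-run against the datastore while k3s is stopped to get per-table and per-index space usage:

```sh
# Report how much space each table and index consumes in the datastore.
sqlite3_analyzer /var/lib/rancher/k3s/server/db/state.db | less
```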
This discussion was converted from issue #10305 on June 06, 2024 18:40.
We have many installations (around 200+) of k3s running on 4-CPU x86 hosts (single-node clusters) running Ubuntu 20.x, mostly at customer sites. Recently, one of the customers has been complaining of high CPU usage (around 80% consistently). I dove into it for a couple of weeks and have landed on some key symptoms (the exact commands are sketched after this list):

- `uptime` yields a load average typically around 8-9, vs. other customers' systems, which usually sit between about 1.0 and 3.0; k3s is the highest CPU user per `top`.
- `dstat -d` shows reads of 30-60 MB/s consistently and writes of 3-20 MB/s consistently, vs. other systems consistently under 1 MB/s; k3s is the highest disk user per `top`.
- `/var/lib/rancher/k3s/server/db/state.db` is > 2 GB, vs. < 50 MB on other systems.
- Per `journalctl -e -u k3s`, most of the logs are Trace, mixed with some errors such as "Compact failed" and "context deadline exceeded" (more detail on logs below).

I tried the vacuum steps mentioned in #1575 (comment), which brought the database from about 2.7 GB down to 2.1 GB, but otherwise didn't seem to help much.
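To reproduce these checks elsewhere, here is a rough sketch of the commands behind the numbers above (only `dstat` typically needs installing on Ubuntu 20.x; the grep pattern is just an assumption matching the error strings quoted above):

```sh
# Load average and the top CPU consumers.
uptime
top -b -n 1 | head -n 20

# Per-second disk read/write throughput.
dstat -d

# Size of the k3s SQLite datastore.
ls -lh /var/lib/rancher/k3s/server/db/state.db

# Surface the compaction errors hiding among the Trace lines.
journalctl -e -u k3s | grep -v Trace | grep -Ei 'compact failed|context deadline exceeded'
```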
I realize 1.22.9 is a few years old, but given the number of installations in the field I'm hoping we can get this fixed without upgrading, and I'm also hoping this is just something simple I haven't spotted yet.
Environmental Info:
K3s Version:

```
k3s version v1.22.9+k3s1 (8b0b50a)
go version go1.16.10
```

Node(s) CPU architecture, OS, and Version:

```
Linux QX7500B5 5.4.0-182-generic #202-Ubuntu SMP Fri Apr 26 12:29:36 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
```
Cluster Configuration:
Single node
Describe the bug:
See above
Steps To Reproduce:
Unsure of the exact steps. I don't think we're doing anything too crazy with the install/config of k3s, but let me know if anything is relevant to this issue and I can include details.
Expected behavior:
System should run at a reasonable CPU utilization level (e.g. 20-30%) and a load average of 2-3, with disk usage per `dstat -d` under 1 MB/s most of the time, and reasonably unpolluted system logs.

Actual behavior:
See above
Additional context / logs:
From `journalctl -e -u k3s`:

From `journalctl -e -u k3s | grep -v Trace`:

`dstat -d`: