switch default cpu governor to schedutil #6120

Merged

lanefu
Member

@lanefu lanefu commented Dec 31, 2023

Description

The schedutil cpu frequency governor is more tightly coupled to userspace activity and performs more granular cpu frequency changes. I've personally been using it for years. I think it's prudent to change to this default, especially given cpu-frequency utils is deprecated and no-longer enabled by default on armbian builds.

I think if this is merged it's an opportunity to also remove some of the unmaintained on-demand governor tweaks in armbian-hardware-optimize as well.

Quoting the kernel docs:

This governor generally is regarded as a replacement for the older ondemand and conservative governors (described below), as it is simpler and more tightly integrated with the CPU scheduler, its overhead in terms of CPU context switches and similar is less significant, and it uses the scheduler's own CPU utilization metric, so in principle its decisions should not contradict the decisions made by the other parts of the scheduler.

Contributor

@schwar3kat schwar3kat left a comment

LGTM

Contributor

@paolosabatino paolosabatino left a comment

I like the schedutil idea and approve it, but I read it still has some performance issues versus on-demand: https://www.phoronix.com/review/schedutil-quirky-2023

Perhaps they are niche cases, but maybe some benchmarks with sbc-bench are worth it, at least for curiosity, since things may have changed in kernel 6.6 with EEVDF in place of the old CFS scheduler.

@lanefu
Member Author

lanefu commented Jan 1, 2024

Perhaps they are niche cases, but maybe some benchmarks with sbc-bench are worth it

We would need to modify sbc-bench, as it switches the governors to performance when executing.
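
One way to confirm whether a benchmark tool changed the governor is to snapshot the active governor of every cpufreq policy before and after the run. The following is only a sketch: the function names are made up for illustration, and it assumes the policies are exposed under /sys/devices/system/cpu/cpufreq/policyN/scaling_governor, as in the sysfs listing quoted later in this thread.

```python
from pathlib import Path

# Assumption: cpufreq policies live under this sysfs directory, matching the
# /sys/devices/system/cpu/cpufreq/policy6 paths quoted later in the thread.
CPUFREQ_ROOT = "/sys/devices/system/cpu/cpufreq"


def read_governors(root=CPUFREQ_ROOT):
    """Return {policy name: active governor} for every policy under root."""
    governors = {}
    for gov_file in sorted(Path(root).glob("policy*/scaling_governor")):
        governors[gov_file.parent.name] = gov_file.read_text().strip()
    return governors


def changed_governors(before, after):
    """Return the policies whose governor differs between two snapshots."""
    return {
        policy: (before.get(policy), after.get(policy))
        for policy in sorted(set(before) | set(after))
        if before.get(policy) != after.get(policy)
    }
```

Calling read_governors() before and after a benchmark and passing both snapshots to changed_governors() would show whether the tool left a different governor active than the one under test.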

Member

@rpardini rpardini left a comment

I've been personally using schedutil for a while with good success and have similar half-PRs unsent for months now. Thanks for sending. Merge.

@igorpecovnik igorpecovnik merged commit a4d3dcb into armbian:main Jan 1, 2024
@lanefu
Member Author

lanefu commented Jan 1, 2024

I think if this is merged it's an opportunity to also remove some of the unmaintained on-demand governor tweaks in armbian-hardware-optimize as well.

Did some benchmarks... there are plenty of ways to split hairs, but my own conclusion is that schedutil + irqbalance gives us good out-of-the-box performance without having to worry about maintaining armbian-hardware-optimize

https://docs.lane-fu.com/s/ZclNlyJQe#SCHEDUTIL-Governor

@ThomasKaiser
Contributor

ThomasKaiser commented Apr 18, 2024

Quoting https://docs.lane-fu.com/s/ZclNlyJQe#SCHEDUTIL-Governor

yeah sure even with the broken armbian-hardware-optimize defaults (for my miscategorized RK3399 board) ondemand did slightly better on geekbench.

Nope, the way Geekbench executes the tests is so flawed that it's almost like adding some random numbers to the results. You need to run this tool at least 10 times in a row to get anything you can work with (not even blindly averaging all results helps, since sometimes you just see that the test execution itself has an influence on performance and you need to forget about the first 2 or 3 runs).

See for example the 10 'Raspberry Pi 5 Model B Rev 1.0 (3.08 GHz)' entries here at the top: https://browser.geekbench.com/user/tkaiser (5 times sbc-bench -G directly fired up in a row that always tests twice to demonstrate how unreliable Geekbench is when it's about generating useful scores, e.g. you want to optimize something and are looking for a performance diff of 1%, 2% or even 5% – impossible with Geekbench)
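
The "run it many times and forget the first few runs" advice can be sketched as a small helper that drops warm-up runs and reports the spread of the rest. The scores below are hypothetical placeholders, not measurements from any board, and the function name and the warm-up count are assumptions chosen for illustration.

```python
import statistics


def summarize_runs(scores, warmup=2):
    """Drop the first `warmup` runs, then report mean, sample stdev and CV in %."""
    kept = list(scores)[warmup:]
    if len(kept) < 2:
        raise ValueError("need at least two runs after discarding warm-up")
    mean = statistics.mean(kept)
    stdev = statistics.stdev(kept)
    return {
        "runs": len(kept),
        "mean": mean,
        "stdev": stdev,
        "cv_percent": 100 * stdev / mean,
    }


# Hypothetical scores for ten consecutive runs (illustration only).
example_scores = [700, 720, 748, 752, 750, 747, 755, 749, 751, 748]
print(summarize_runs(example_scores, warmup=3))
```

If the coefficient of variation of the remaining runs is in the same range as the difference being looked for (1%, 2% or 5%), the tool cannot resolve that difference.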

And see this explanation

However the results of Schedutil paired with Armbian Hardware Optimize Disabled + Irqbalance enabled came really close and disk benchmarks looked better.

Nope, what your 6 fio tests really show is this instead:

  • "armbian-hardware-optimize enabled" identically worse regardless of governor
  • "Armbian Hardware Optimize Disabled" all the same (better) regardless of governor or irqbalance

As such you will have done some further checking into what causes "armbian-hardware-optimize enabled" to result in slower MMC benchmark scores, maybe the set_io_scheduler function doing silly things? And then there's standard deviation, and these tests would also have to be repeated multiple times to get any insights: when numbers differ by only a few %, how will you know when it's a change in settings and when it's just result variation?

Aside from that, I have no idea how these two benchmarks, each executed only once per 'setting' in 'fire and forget' mode, should be able to justify any performance assessment wrt the cpufreq governor. Have you compared with performance to check whether these benchmarks are affected by the cpufreq governor at all?
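
A crude way to frame the "settings change or just result variation?" question is to compare the difference between two sets of repeated runs with their run-to-run spread. This is a heuristic sketch, not a proper significance test: the function name, the factor k and all numbers are assumptions for illustration only.

```python
import statistics


def differs_beyond_noise(runs_a, runs_b, k=2.0):
    """Heuristic: True if the means differ by more than k times the larger sample stdev."""
    spread = max(statistics.stdev(runs_a), statistics.stdev(runs_b))
    return abs(statistics.mean(runs_a) - statistics.mean(runs_b)) > k * spread


# Hypothetical repeated results (MB/s) for one benchmark under two governors.
performance_runs = [35.9, 36.1, 35.8, 36.0]
candidate_runs = [35.7, 36.2, 35.9, 36.0]
print(differs_beyond_noise(performance_runs, candidate_runs))
```

Running the same comparison against the performance governor first shows whether the benchmark reacts to the governor at all. If it does not, it says nothing about any other governor either.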

It seems this governor switch is only based on feelings and quoted documents that are +99% based on situation on x86, right?

And even if I know that none of you guys cares about such stuff: congratulations, you completely ruined Armbian NAS 'performance' compared to 2023 settings with this single change: https://github.com/ThomasKaiser/Knowledge/blob/master/articles/Quick_Preview_of_ROCK_5_ITX.md#cpufreq-governor-schedutil-dmc-governor-set-to-dmc_ondemand-armbian-defaults-with-rk3588-today

Would be fair to announce Armbian not caring about 'energy efficient server stuff' any more (unlike a decade ago when this was the primary goal) but only about 'Desktop Linux' or whatever today. Though I know that won't happen either

@rpardini
Member

@ThomasKaiser this would all have been a valid and fruitful discussion had you engaged during the PR's lifetime -- now, necro apart, please be respectful; the change is in as it was the best solution we (however poorly) evaluated, being generic enough. Your "NAS" use-case, albeit common, is not the only one.

The main problem here resides in the fact that the relevant changes/settings are either in kernel configs/patches or in a monolithic 1990's style single script filled with conditionals -- which ofc has become completely unmaintainable, and to which you've never proposed architectural changes, only more conditionals. So "congratulations" to you too.

When I have enough cycles to spare, I'd look into how to make such settings easily doable in userpatches/extensions, so you can actually contribute to say an extensions/tkaiser-optimized-nas.sh instead of constantly blaming generic/core Armbian.

See you in a separate PR 🖖

@ThomasKaiser
Contributor

this would all have been a valid and fruitful discussion had you engaged during the PR's lifetime

LOL! The PR was sent on the 31st of Dec and merged on the 1st of Jan. Even if I were part of this project, at this time of the year I spend time with family. And the stuff I have commented on (the benchmarking methodology) happened after the merge.

But good to know that zero evaluation has been done how the switch to schedutil affects performance of real world scenarios (benchmarks like silly Geekbench that are designed to constantly and fully utilize CPU cores should always be able to ramp up clockspeeds to the max regardless of governor and results should be the same except for powersave on platforms != x86_64. On x86_64 for which vast majority of governor documentation has been written some things are different especially with recent versions of intel_pstate/amd-pstate drivers)

Your "NAS" use-case, albeit common, is not the only one.

I know very well and wonder which testing has been done to evaluate governor change other than 'I use this governor personally and am fine' and 'some doc says it should be better' (or in reality mostly related to cpufrequtils and this ugly unmaintained script)?

And of course nobody will revisit the question the fio benchmark brought up: what causes storage performance to suffer from armbian-hardware-optimization being executed? Inappropriate I/O scheduler settings? Something else?

@ThomasKaiser
Contributor

ThomasKaiser commented Apr 18, 2024

Just for laughs...

to also remove some of the unmaintained on-demand governor tweaks in armbian-hardware-optimize as well.

Holy cow. The real problem is not even a single member of 'Team Armbian' today wanting to wrap his head around this nonsense for a few seconds since nobody cares about this sort of optimization for 6 years now. This is the 'generic' approach for this 'problem': https://github.com/ThomasKaiser/build/blob/9c0bc9db7ac89d267911c8e1e981ceea447dcd78/packages/bsp/common/usr/lib/armbian/armbian-hardware-optimization#L75-L83

Works only for 1 to 100 CPU cores. In case you want to deal with more it's time to add another string cpu[1-9][1-9][0-9]/cpufreq/ondemand.
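
For comparison, a pattern-free approach would enumerate whatever policy directories exist instead of matching CPU numbers digit by digit. The sketch below is not the linked script. It assumes per-policy ondemand tunables under /sys/devices/system/cpu/cpufreq/policyN/ondemand (as in the listing later in this thread), and the function names are hypothetical. On kernels that expose the ondemand tunables globally the directory layout differs.

```python
from pathlib import Path

# Assumption: per-policy tunables, e.g. .../cpufreq/policy6/ondemand/up_threshold
CPUFREQ_ROOT = "/sys/devices/system/cpu/cpufreq"


def ondemand_dirs(root=CPUFREQ_ROOT):
    """Return every per-policy ondemand tunables directory under root."""
    return sorted(p for p in Path(root).glob("policy*/ondemand") if p.is_dir())


def apply_tunables(tunables, root=CPUFREQ_ROOT):
    """Write each tunable (name -> value) into every ondemand directory found."""
    written = []
    for directory in ondemand_dirs(root):
        for name, value in tunables.items():
            target = directory / name
            if target.exists():  # skip tunables this kernel does not offer
                target.write_text(f"{value}\n")
                written.append(target)
    return written
```

Called as root with, for example, apply_tunables({"up_threshold": 25, "sampling_down_factor": 10, "io_is_busy": 1}), this would touch every policy regardless of how many cores or clusters the system has.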

but I read it still has some performance issues versus on-demand: https://www.phoronix.com/review/schedutil-quirky-2023

@paolosabatino appreciate that you even mentioned potential performance issues with this governor. But... this was on x86, this was Phoronix and why would this apply to the situation on ARM?

Michael Larabel wrote:

The performance governor on the Threadripper 3990X while running the video encode benchmarks actually had comparable power use to ondemand and in turn was lower overall on average than the acpi-cpufreq schedutil configuration used by default.

How should this be possible: schedutil consuming more power than performance?

On ARM performance keeps all CPU cores at their maximum clockspeed (associated with maximum consumption) while schedutil, interactive, ondemand and friends try to adjust clockspeeds depending on utilization between min and max and as such it's impossible (on ARM) to get higher consumption with any other governor compared to performance. Situation on x86_64 is different, guys!

And the same 'logic problem' applies to the benchmark expectations. Switching cpufreq governor affects how clockspeeds behave in real world situations. How fast will they ramp up when a user moves the mouse (desktop use case) or copies medium sized files to a server (NAS use case) or will be able to answer web requests (web server)?

Most benchmarks are pretty much unaffected by this since they try to fully utilize the CPU cores anyway which results in every governor (not being completely broken) in ramping up clockspeeds to the max. That's why you need to always test with performance too in such situations since 'benchmarking the benchmark' is the first step to identify whether you're just wasting your time or not.

Talking about the latter: another homework done for you @lanefu https://browser.geekbench.com/user/tkaiser

And apologies for sounding rude as usual: I don't care. At least we have documented together Armbian having switched to an entirely different target audience (DietPi + desktop?) than in the past and the situation for people wanting to run a small and energy efficient server on ARM hardware is pretty much fucked up since crap like this will happen anytime again.

[comment archived]

@ThomasKaiser
Contributor

ThomasKaiser commented Apr 19, 2024

Testing method: curl -sL yabs.sh | bash -s -- -i -g (mixed read/write workload) followed by iozone -e -I -a -s 100M -r 4k -r 16384k -i 0 -i 1 -i 2 | cut -c7-78 | tail -n6 | head -n4 (individually testing sequential and random I/O):

Testing devices: 2 SDIO connected MMC devices

Testing host: Rock 5 ITX with 5.10.160 legacy kernel

Testing focus: how the set_io_scheduler function in armbian-hardware-optimization negatively affects real world storage performance (problem discovered on 2024/01/01 by @lanefu but obviously not understood)
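
To verify which I/O scheduler is actually active for a device before each run, the queue/scheduler file in sysfs can be read. The kernel lists the available schedulers on one line and puts the active one in square brackets. The helper names below are made up for this sketch, and treating a single unbracketed entry as the active scheduler is an assumption about older kernels.

```python
from pathlib import Path


def parse_scheduler_line(text):
    """Split a queue/scheduler line into (active, available) scheduler names."""
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    # Assumption: a lone entry without brackets is the active scheduler.
    if active is None and len(available) == 1:
        active = available[0]
    return active, available


def active_io_scheduler(device, sys_block="/sys/block"):
    """Read the active I/O scheduler for a block device such as 'mmcblk0'."""
    line = (Path(sys_block) / device / "queue" / "scheduler").read_text()
    return parse_scheduler_line(line)[0]
```

For the two devices tested here that would be active_io_scheduler("mmcblk0") and active_io_scheduler("mmcblk1").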

256 GB Samsung EVO Plus A2 / UHS-I/SDR104

Reported by sbc-bench -S as 238.8GB "Samsung EE4S5" UHS SDR104 SDXC card as /dev/mmcblk1: date 05/2023, manfid/oemid: 0x00001b/0x534d, hw/fw rev: 0x3/0x0

performance cpufreq governor, mq-deadline I/O scheduler (Kernel default):

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/mmcblk1p2):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 2.48 MB/s      (621) | 12.33 MB/s     (192)
Write      | 2.49 MB/s      (624) | 12.92 MB/s     (201)
Total      | 4.98 MB/s     (1.2k) | 25.25 MB/s     (393)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 29.79 MB/s      (58) | 34.30 MB/s      (33)
Write      | 32.33 MB/s      (63) | 38.26 MB/s      (37)
Total      | 62.13 MB/s     (121) | 72.57 MB/s      (70)

YABS completed in 1 min 44 sec
                                                    random    random
    kB  reclen    write  rewrite    read    reread    read     write
102400       4     2713     2781    12878    13539    10743     2778
102400   16384    63574    64109    89019    88998    88993    62516

ondemand cpufreq governor with tweaks, mq-deadline I/O scheduler (Kernel default):

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/mmcblk1p2):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 2.33 MB/s      (584) | 12.11 MB/s     (189)
Write      | 2.35 MB/s      (589) | 12.71 MB/s     (198)
Total      | 4.69 MB/s     (1.1k) | 24.82 MB/s     (387)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 29.81 MB/s      (58) | 34.23 MB/s      (33)
Write      | 32.36 MB/s      (63) | 38.19 MB/s      (37)
Total      | 62.17 MB/s     (121) | 72.42 MB/s      (70)

YABS completed in 1 min 45 sec
                                                    random    random
    kB  reclen    write  rewrite    read    reread    read     write
102400       4     2825     2754    12389    12158     9929     2632
102400   16384    65507    62404    88925    88464    88850    62786

performance cpufreq governor, none I/O scheduler (Armbian):

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/mmcblk1p2):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 2.09 MB/s      (524) | 11.84 MB/s     (185)
Write      | 2.11 MB/s      (528) | 12.35 MB/s     (193)
Total      | 4.21 MB/s     (1.0k) | 24.19 MB/s     (378)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 28.85 MB/s      (56) | 33.44 MB/s      (32)
Write      | 31.32 MB/s      (61) | 37.31 MB/s      (36)
Total      | 60.17 MB/s     (117) | 70.75 MB/s      (68)

YABS completed in 1 min 45 sec
                                                    random    random
    kB  reclen    write  rewrite    read    reread    read     write
102400       4     2759     2795    13914    14402    10925     2801
102400   16384    64409    63291    89442    87173    89354    64725

ondemand cpufreq governor with tweaks, none I/O scheduler (Armbian):

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/mmcblk1p2):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 2.10 MB/s      (526) | 11.90 MB/s     (186)
Write      | 2.12 MB/s      (531) | 12.43 MB/s     (194)
Total      | 4.23 MB/s     (1.0k) | 24.34 MB/s     (380)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 28.96 MB/s      (56) | 33.28 MB/s      (32)
Write      | 31.44 MB/s      (61) | 37.13 MB/s      (36)
Total      | 60.40 MB/s     (117) | 70.41 MB/s      (68)

YABS completed in 1 min 47 sec
                                                    random    random
    kB  reclen    write  rewrite    read    reread    read     write
102400       4     2832     2737    13632    13496    10255     2638
102400   16384    64155    62568    89379    85709    89445    63022

32 GB Samsung eMMC / HS400

Reported by sbc-bench -S as 29.1GB "Samsung BJTD4R" HS400 Enhanced strobe eMMC 5.1 card as /dev/mmcblk0: date 09/2023, manfid/oemid: 0x000015/0x0100, hw/fw rev: 0x0/0x0300000000000000

performance cpufreq governor, mq-deadline I/O scheduler (Kernel default):

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/mmcblk0p1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 17.98 MB/s    (4.4k) | 48.15 MB/s     (752)
Write      | 17.97 MB/s    (4.4k) | 49.58 MB/s     (774)
Total      | 35.96 MB/s    (8.9k) | 97.73 MB/s    (1.5k)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 67.11 MB/s     (131) | 70.09 MB/s      (68)
Write      | 72.85 MB/s     (142) | 78.20 MB/s      (76)
Total      | 139.97 MB/s    (273) | 148.29 MB/s    (144)

YABS completed in 1 min 4 sec
                                                    random    random
    kB  reclen    write  rewrite    read    reread    read     write
102400       4    32999    44313    23939    24462    24021    42385
102400   16384   109130   108452   299802   304475   304937   108222

ondemand cpufreq governor with tweaks, mq-deadline I/O scheduler (Kernel default):

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/mmcblk0p1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 17.98 MB/s    (4.4k) | 48.31 MB/s     (754)
Write      | 17.97 MB/s    (4.4k) | 49.75 MB/s     (777)
Total      | 35.96 MB/s    (8.9k) | 98.07 MB/s    (1.5k)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 67.62 MB/s     (132) | 69.83 MB/s      (68)
Write      | 73.40 MB/s     (143) | 77.91 MB/s      (76)
Total      | 141.03 MB/s    (275) | 147.74 MB/s    (144)

YABS completed in 1 min 4 sec
                                                    random    random
    kB  reclen    write  rewrite    read    reread    read     write
102400       4    29483    41739    23117    22968    22960    40523
102400   16384   108860   109079   309276   309018   308988   109607

performance cpufreq governor, none I/O scheduler (Armbian):

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/mmcblk0p1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 11.87 MB/s    (2.9k) | 37.09 MB/s     (579)
Write      | 11.87 MB/s    (2.9k) | 38.20 MB/s     (596)
Total      | 23.75 MB/s    (5.9k) | 75.30 MB/s    (1.1k)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 63.03 MB/s     (123) | 67.67 MB/s      (66)
Write      | 68.42 MB/s     (133) | 75.51 MB/s      (73)
Total      | 131.46 MB/s    (256) | 143.18 MB/s    (139)

YABS completed in 1 min 9 sec
                                                    random    random
    kB  reclen    write  rewrite    read    reread    read     write
102400       4    75158    60720    24565    24675    24453    51035
102400   16384   107822   107994   309762   311168   309351   108327

ondemand cpufreq governor with tweaks, none I/O scheduler (Armbian):

fio Disk Speed Tests (Mixed R/W 50/50) (Partition /dev/mmcblk0p1):
---------------------------------
Block Size | 4k            (IOPS) | 64k           (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 11.88 MB/s    (2.9k) | 37.55 MB/s     (586)
Write      | 11.88 MB/s    (2.9k) | 38.67 MB/s     (604)
Total      | 23.76 MB/s    (5.9k) | 76.23 MB/s    (1.1k)
           |                      |                     
Block Size | 512k          (IOPS) | 1m            (IOPS)
  ------   | ---            ----  | ----           ---- 
Read       | 63.43 MB/s     (123) | 67.55 MB/s      (65)
Write      | 68.86 MB/s     (134) | 75.36 MB/s      (73)
Total      | 132.29 MB/s    (257) | 142.91 MB/s    (138)

YABS completed in 1 min 9 sec
                                                    random    random
    kB  reclen    write  rewrite    read    reread    read     write
102400       4    35856    54851    32157    32209    31966    48897
102400   16384   108402   108275   301238   303010   307528   108479
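
To make the tables above easier to compare, the relative change between settings can be computed from the fio "Total" 4k column. The values in this sketch are copied from the tables above. The dictionary layout and function name are illustrative only, and a single run per setting still says nothing about run-to-run variation.

```python
def percent_change(baseline, value):
    """Relative change of value versus baseline, in percent."""
    return 100.0 * (value - baseline) / baseline


# fio "Total" 4k throughput in MB/s from the tables above.
# Key: (device, cpufreq governor, I/O scheduler)
FIO_4K_TOTAL = {
    ("sd", "performance", "mq-deadline"): 4.98,
    ("sd", "ondemand", "mq-deadline"): 4.69,
    ("sd", "performance", "none"): 4.21,
    ("sd", "ondemand", "none"): 4.23,
    ("emmc", "performance", "mq-deadline"): 35.96,
    ("emmc", "ondemand", "mq-deadline"): 35.96,
    ("emmc", "performance", "none"): 23.75,
    ("emmc", "ondemand", "none"): 23.76,
}

for device in ("sd", "emmc"):
    for governor in ("performance", "ondemand"):
        baseline = FIO_4K_TOTAL[(device, governor, "mq-deadline")]
        value = FIO_4K_TOTAL[(device, governor, "none")]
        change = percent_change(baseline, value)
        print(f"{device}/{governor}: none vs mq-deadline at 4k: {change:+.1f}%")
```

Comparing rows with the same scheduler but different governors in the same way shows how small the governor effect is next to the scheduler effect in these runs.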

[comment archived]

@lanefu
Member Author

lanefu commented Apr 19, 2024

I knew the armbian-selected io_scheduler was the problem. I had originally intended to follow up with a 2nd PR to address a "simplification" of armbian-hardware-optimize.

As we know, armbian-hardware-optimize is not maintained, and nobody looks into it. (Mild Exceptions being JetHome, Oleg at some point in the past, and I think an effort was made for some Rk3588)

Inspired by prior feedback about getting rid of it since nobody maintains it, my real thinking was "can we simplify to some new sane defaults with good enough performance and live without it?" Passionate people could then go add their own optimizations as they see fit. Or, as Ricardo pointed out, there could be a more intuitive mechanism for tying these things to a board configuration rather than burying them in a monolithic script in the build system.

There should have been more conversation. There wasn't. I should have finished the job. I didn't.

I've been using an inelegant extension that satisfies my own concerns verified through extensive non-rigorous seat-of-the-pants unscientific testing. Yes I'm aware that irqbalance doesn't consider big.LITTLE scenarios. Yes I'm aware that its default setting is to periodically re-run and re-balance.

@lanefu
Member Author

lanefu commented Apr 19, 2024

@ThomasKaiser

I understand there's some intent woven in this conversation to teach some of the technical considerations, examples of testing methodologies, and demonstrate the creation of a good paper trail for later review.

...but that high-value information is hard to focus on, as it mostly just feels like being poked, hit on the head with a shovel, then handed a bunch of reports to read.

This conversation leaves me exhausted and extremely uninspired. Going to do my best to not let it steal from my weekend.

@ThomasKaiser
Contributor

ThomasKaiser commented Apr 20, 2024

especially given cpu-frequency utils is deprecated and no-longer enabled by default on armbian builds.

I guess you were talking about the cpufrequtils package? Even if none of you guys is interested in such stuff, I'll tell you a story from Armbian's history (or to be more precise: Mikhail's and my history of tweaking cpufreq settings; nobody else at Armbian ever had an interest in or was involved in such stuff)

The most important part of cpufrequtils package usage is setting MIN_SPEED, for the following reason: some kernels (or DT; with early 3.x kernels we dealt with vendor proprietary descriptions like Allwinner's fex and so on) define an awful lot of cpufreq OPP, some of them really low for no actual reason, which results in a wide variety of available cpufreqs, e.g. 2000 MHz down to 100 MHz, w/o any benefit since below a specific threshold there is no difference in idle consumption any more.

Personally tested with interactive, ondemand and schedutil (the latter IIRC with kernel 4.6 around 2016): once the range of available cpufreq OPP extends too wide and to very low ones the whole performance characteristics change and result in a system behaving significantly slower. Simple solution: just mask all those low cpufreq OPP away by using cpufrequtils and the MIN_SPEED setting and afterwards you can enjoy a system that behaves snappy again (you can check old Github issues discussing this and maybe also somewhere in the forums where Blockwart Werner constantly shuffles contents around so that nobody finds anything anymore and all old links are broken now)

I have revisited schedutil a few times since then w/o any scientific approach, since simple kitchen-sink benchmarking always resulted in lower performance compared to ondemand. Testing was done on ARM, unlike the rest of the world doing it on x86 (where most of the real stuff is controlled a layer below anyway). And testing was of course not done with stuff like Geekbench, since how on earth should a benchmark designed to fully utilize CPU cores tell you anything about cpufreq scaling behaviour in real world situations?

So what to do when cpufrequtils becomes deprecated? Simply replace it, the functionality used by Armbian was a joke anyway and it takes 9 lines of script code to fully replace it: ThomasKaiser@0343445

Does it work?

root@rock-5b:/home/tk# ls -la /etc/init.d/cpufrequtils
ls: cannot access '/etc/init.d/cpufrequtils': No such file or directory

root@rock-5b:/home/tk# cat /etc/default/cpufrequtils 
ENABLE=false
MIN_SPEED=508000
MAX_SPEED=1800000
GOVERNOR=ondemand

root@rock-5b:/home/tk# grep . /sys/devices/system/cpu/cpufreq/policy6/* 2>/dev/null
/sys/devices/system/cpu/cpufreq/policy6/affected_cpus:6 7
/sys/devices/system/cpu/cpufreq/policy6/cpuinfo_cur_freq:816000
/sys/devices/system/cpu/cpufreq/policy6/cpuinfo_max_freq:2208000
/sys/devices/system/cpu/cpufreq/policy6/cpuinfo_min_freq:408000
/sys/devices/system/cpu/cpufreq/policy6/cpuinfo_transition_latency:356000
/sys/devices/system/cpu/cpufreq/policy6/related_cpus:6 7
/sys/devices/system/cpu/cpufreq/policy6/scaling_available_frequencies:408000 600000 816000 1008000 1200000 1416000 1608000 1800000 2016000 2208000 
/sys/devices/system/cpu/cpufreq/policy6/scaling_available_governors:conservative ondemand userspace powersave performance schedutil 
/sys/devices/system/cpu/cpufreq/policy6/scaling_cur_freq:816000
/sys/devices/system/cpu/cpufreq/policy6/scaling_driver:rockchip-cpufreq
/sys/devices/system/cpu/cpufreq/policy6/scaling_governor:ondemand
/sys/devices/system/cpu/cpufreq/policy6/scaling_max_freq:1800000
/sys/devices/system/cpu/cpufreq/policy6/scaling_min_freq:600000
/sys/devices/system/cpu/cpufreq/policy6/scaling_setspeed:<unsupported>

root@rock-5b:/home/tk# grep . /sys/devices/system/cpu/cpufreq/policy6/ondemand/*
/sys/devices/system/cpu/cpufreq/policy6/ondemand/ignore_nice_load:0
/sys/devices/system/cpu/cpufreq/policy6/ondemand/io_is_busy:1
/sys/devices/system/cpu/cpufreq/policy6/ondemand/powersave_bias:0
/sys/devices/system/cpu/cpufreq/policy6/ondemand/sampling_down_factor:10
/sys/devices/system/cpu/cpufreq/policy6/ondemand/sampling_rate:200000
/sys/devices/system/cpu/cpufreq/policy6/ondemand/up_threshold:25
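
The masking effect of MIN_SPEED and MAX_SPEED can be illustrated with the numbers from the output above: a requested minimum of 508000 coincides with scaling_min_freq:600000, the lowest listed frequency at or above the request. Whether that rounding happens in the replacement script or in the kernel is not stated here, so this sketch only reproduces the selection. The function names are hypothetical, and the last list entry is read as 2208000 to match cpuinfo_max_freq.

```python
# Frequencies in kHz from scaling_available_frequencies in the output above
# (last entry read as 2208000, matching cpuinfo_max_freq).
AVAILABLE_KHZ = [408000, 600000, 816000, 1008000, 1200000,
                 1416000, 1608000, 1800000, 2016000, 2208000]


def usable_opps(available_khz, min_khz, max_khz):
    """Return the OPPs that remain selectable between the two limits."""
    return sorted(f for f in available_khz if min_khz <= f <= max_khz)


def effective_min_freq(available_khz, requested_min_khz):
    """Return the lowest available frequency that is >= the requested minimum."""
    candidates = [f for f in sorted(available_khz) if f >= requested_min_khz]
    if not candidates:
        raise ValueError("requested minimum is above every available frequency")
    return candidates[0]


# MIN_SPEED and MAX_SPEED as shown in /etc/default/cpufrequtils above.
print(effective_min_freq(AVAILABLE_KHZ, 508000))
print(usable_opps(AVAILABLE_KHZ, 508000, 1800000))
```

With these limits the 408000 OPP at the bottom and the 2016000 and 2208000 OPPs at the top are masked away, which matches the scaling_min_freq and scaling_max_freq values in the listing.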

Is it future proof and works on any system? Sure, but the ondemand tweaks only for up to 100 CPU cores, as an exercise to check whether any of you guys now even tries to wrap their heads around this. I guess not, as 'Team Armbian' for 6 years has had zero members with interest/motivation or even basic knowledge in this area?

[comment archived]

@token0

token0 commented Apr 20, 2024

I was using schedutil for a few years on an aarch64 board, but reconsidered and went to ondemand as simpler, snappier and less-overhead governor. I don't see any profit in using any other governors if you aren't on battery power.

@ThomasKaiser
Contributor

ThomasKaiser commented Apr 20, 2024

went to ondemand as simpler, snappier and less-overhead governor. I don't see any profit in using any other governors if you aren't on battery power.

W/o having a specific use case in mind IMO it's always hard to talk about those things. And we need to keep in mind that schedutil is the only governor being able to make use of EAS (Energy Aware Scheduling) and as such should be the only reasonable choice on hybrid platforms: in our case big.LITTLE/DynamIQ.

But for this to work properly (or in any reasonable fashion) prerequisites need to be met, such as DT properties for cores like capacity-dmips-mhz, dynamic-power-coefficient and/or sched-energy-costs. W/o those or with wrong values assigned the whole scheduling will

  • either not work correctly (the scheduler not choosing the right cores for a task)
  • or outright wrong (the scheduler choosing the wrong cores for a task)

Back in the days when I joined Armbian it wasn't called Armbian but known as "Igor's image", and the focus was primarily on small, energy efficient servers. Hence the 'design decisions' back then: servers idle a lot, so we needed settings that kept consumption overall low but ramped up clockspeeds immediately when needed. schedutil didn't exist at that time (maybe in experimental stage), big.LITTLE designs weren't supported yet, irqbalance did nothing, and for the 'server use case' static IRQ affinity was best. As such, days/weeks of testing resulted in those settings that were valid until 2024 (with the exception of added platforms from 2019 on, where almost nobody took care any more about IRQ affinity, and big.LITTLE designs where the big cluster policy wasn't called policy4 got no ondemand tweaks due to deliberate ignorance).

I have no idea what Armbian is officially about today (DietPi + desktop? 'Linux with desktop for people who can't or don't want to afford x86'? Something else?) but what I know is the decision to change default governor was based on zero evaluation.

[comment archived]

@sfx2000
Collaborator

sfx2000 commented Apr 21, 2024

@ThomasKaiser makes good points - it's really about the use-case for the device.

OnDemand is still a good default... SchedUtil is a second best estimate/guess...

It depends on the SoC/Board and the purpose - and there, it's thermals, and the OPP's...

@ThomasKaiser
Contributor

ThomasKaiser commented Apr 24, 2024

SchedUtil is a second best estimate/guess...

@sfx2000 as mentioned before this scheduler needs some information about the cores in hybrid designs to work correctly. At the moment I would believe among the fleet of 'Armbian supported' devices schedutil has only a chance to behave correctly on RK3399 boards since on others needed properties are missing or wrong or both.

@ThomasKaiser
Contributor

Talking about proper properties giving schedutil the chance to work somewhat correctly (EAS) let's have a look at four SoCs with A55 and A76 inside:

RK3588 (A76 1.9 times faster than A55):

  • A55: capacity-dmips-mhz = <530>; dynamic-power-coefficient = <100>;
  • A76: capacity-dmips-mhz = <1024>; dynamic-power-coefficient = <300>;

Amlogic S928X (A76 exactly two times faster)

  • A55: capacity-dmips-mhz = <512>; + dynamic-power-coefficient = <1024>;
  • A76: capacity-dmips-mhz = <1024>; + dynamic-power-coefficient = <512>;

Unisoc UMS9620 (A76 2.45 times faster than A55)

  • A55: capacity-dmips-mhz = <417>; sched-energy-costs = <&CPU_COST_0 &CLUSTER_COST_0>;
  • A76: capacity-dmips-mhz = <1022>; sched-energy-costs = <&CPU_COST_1 &CLUSTER_COST_1>;

Google G1 (A76 2.5 times faster than A55):

  • A55: capacity-dmips-mhz = <250>; dynamic-power-coefficient = <70>;
  • A76: capacity-dmips-mhz = <620>; dynamic-power-coefficient = <284>;
  • X1: capacity-dmips-mhz = <1024>; dynamic-power-coefficient = <650>;
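
The "times faster" figures in the headings above follow from dividing the A76 value by the A55 value, since capacity-dmips-mhz expresses relative per-MHz capacity. A short sketch reproducing that arithmetic from the quoted values (the dictionary and function name are illustrative only):

```python
# capacity-dmips-mhz values as quoted above: {SoC: (A55, A76)}
CAPACITY_DMIPS_MHZ = {
    "RK3588": (530, 1024),
    "Amlogic S928X": (512, 1024),
    "Unisoc UMS9620": (417, 1022),
    "Google G1": (250, 620),
}


def implied_speedup(little, big):
    """Per-MHz performance ratio big/little implied by capacity-dmips-mhz."""
    return big / little


for soc, (a55, a76) in CAPACITY_DMIPS_MHZ.items():
    print(f"{soc}: A76 is {implied_speedup(a55, a76):.2f}x the A55 per MHz")
```

The same division applied to a measured single-threaded score pair at identical clocks gives the ratio a vendor's DT values can be checked against.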

Funny! We just learned that A76 and A55 perform differently based on which SoC vendor is using them.

The most amusing properties are those from Amlogic of course, obviously the person having added those has no idea what this whole stuff is about and just decided to use a nicely looking 1024, divide it by 2 for fun and then throw those two numbers at random DT locations.

Google/Samsung seem to have actually measured something while the 100/300 value pair from Rockchip for the RK3588 is obviously just a wild guess.

The clear winner is Unisoc since they provide detailed power costs.

But why do the capacity-dmips-mhz values differ between SoC vendors? And are these numbers really based on DMIPS, since everybody knows that running Dhrystone just produces BS, not just these days but for decades already? Turns out SoC vendors actually use dhrystone, see https://lore.kernel.org/linux-devicetree/[email protected]/T/ for example. LMAO!

Let's use the popular Geekbench 6 suite (though giving questionable numbers on any platform other than x86) and have a look at the A76 and A55 both at 1800 MHz in RK3588 (with the LPDDR at 4800 MT/s): MaxKHz=1800000 Netio=powerbox-1/4 sbc-bench.sh -G (sbc-bench executes Geekbench multiple times and on hybrid systems also on each CPU cluster individually)

A76 single-threaded 3.23 times faster than A55 at same clock: https://browser.geekbench.com/v6/cpu/compare/5841948?baseline=5842066 (full results)

[Screenshot, 2024-04-24 15:29:27 (small)]

3.2 times faster or only 1.9 times faster according to Rockchip's DT creator with power costs obviously not the result of any measurement but just 2 random numbers thrown into DT: good luck schedutil to do your energy aware scheduling magic (just kidding: in reality it's 'to work somewhat correctly at all')!

Labels
Hardware (Hardware related like kernel, U-Boot, ...), size/medium (PR with more than 50 and less than 250 lines)
8 participants