While per-process cpu wait is a useful measure, the total or per-cpu wait charts are a different story. They can make it look like the system is io-limited when it isn't.
They're populated from /proc/schedstat, which reports the "sum of all time spent waiting to run by tasks on [each] processor". That means two tasks waiting at the same time are counted twice, and knowing that one task waits while another runs is really only interesting at the per-task level, not per-cpu. So I often see both the CPU utilization and the CPU wait charts saturated. As far as I can tell, that just means I'm cpu-limited, would be io-limited with a faster processor, and that parallel service startup is doing me a favour. Only the first of those insights is really useful; the other two are academic at best and confusing at worst (it's easy to read the chart as saying I'm io-limited).
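For concreteness, here's a minimal sketch (not the bootchart code) of reading those per-cpu counters, assuming the 9-field cpuN line layout of recent schedstat versions where the 8th field is the cumulative wait time:

```python
# Sketch, assuming the 9-field "cpuN" line layout of recent schedstat
# versions: the 8th field is the cumulative time spent waiting to run,
# in nanoseconds, summed over all tasks on that CPU.
def read_cpu_waittime(path="/proc/schedstat"):
    waits = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields and fields[0].startswith("cpu"):
                # fields[8]: sum of all time spent waiting to run by tasks
                # on this processor. Because it is summed over tasks, two
                # tasks each waiting 10 ms add 20 ms of "wait" here while
                # only 10 ms of wall-clock time passed.
                waits[fields[0]] = int(fields[8])
    return waits
```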
iowait from /proc/stat seems a more useful measure for a global or per-cpu chart, as it only counts time spent waiting on I/O when the CPU has no other work to do.
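A minimal sketch of what I mean, assuming the usual /proc/stat layout where iowait is the fifth value after the cpu label (in USER_HZ ticks):

```python
# Sketch: read iowait from /proc/stat. On the aggregate "cpu" line and
# the per-cpu "cpuN" lines the order is user, nice, system, idle,
# iowait, ... in USER_HZ ticks, so iowait is the fifth value.
def read_iowait(path="/proc/stat"):
    iowait = {}
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields and fields[0].startswith("cpu"):
                # iowait only accumulates while the CPU is otherwise idle,
                # so unlike the schedstat wait sum it can never exceed
                # wall-clock time.
                iowait[fields[0]] = int(fields[5])
    return iowait
```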
Finally, I see that the svg graphing code normalises waittime to the scheduling interval, but the double counting etc. means that waittime can easily exceed the scheduling interval and be truncated.
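A toy example with made-up numbers of how that ratio blows past 1 and gets clamped:

```python
# Made-up numbers: over one 20 ms sample, four runnable tasks each wait
# for the whole sample on the same CPU, so schedstat's wait sum grows by
# 4 * 20 ms even though only 20 ms of wall-clock time passed.
interval_ns = 20_000_000
wait_delta_ns = 4 * interval_ns

ratio = wait_delta_ns / interval_ns   # 4.0
drawn = min(ratio, 1.0)               # truncated to the top of the chart
print(f"raw ratio {ratio:.1f}, drawn as {drawn:.1f}")
```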
Maybe I'm misunderstanding something - I don't see where to find the bootchart history from before it was merged into systemd in 83fdc450aa8f79941bec84488ffd5bf8eadab18e, so maybe the motivation for using /proc/schedstat is explained in the original repo?
From the v4 documentation: "Last three are statistics dealing with scheduling latency: [...] The last three make it possible to find the average latency on a particular runqueue or the system overall. Given two points in time, A and B, (22B - 22A)/(23B - 23A) will give you the average time processes had to wait after being scheduled to run but before actually running."
Of course, these lines were dropped in subsequent versions of the schedstat format docs, but it may well be that the per-cpu wait time is only meant as an input to an average, not as a raw figure that is meaningful on its own.
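If that's the intent, here's a sketch of the averaging the v4 docs describe, using what I take to be the current equivalents of fields 22 and 23 (the cumulative wait time and the timeslice count, fields 8 and 9 of today's cpu lines):

```python
# Sketch: average per-timeslice wait latency between two samples A and B,
# i.e. (wait_B - wait_A) / (slices_B - slices_A), as in the v4 docs.
def avg_wait_latency_ns(sample_a, sample_b):
    wait_a, slices_a = sample_a   # (cumulative wait ns, # of timeslices)
    wait_b, slices_b = sample_b
    if slices_b == slices_a:
        return 0.0
    return (wait_b - wait_a) / (slices_b - slices_a)

# e.g. avg_wait_latency_ns((1_000_000, 100), (5_000_000, 300)) == 20000.0 ns
```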