NO_HZ_IDLE causes consistently low cpu "iowait" time (and higher cpu "idle" time)
From: Alan Jenkins
Date: Mon Jul 01 2019 - 11:33:58 EST
Hi
I tried running a simple test:
    dd if=testfile iflag=direct bs=1M of=/dev/null
With my default settings, `vmstat 10` shows something like 85% idle time
to 15% iowait time. I have 4 CPUs, so this is much less than one CPU
worth of iowait time.
If I boot with "nohz=off", I see idle time fall to 75% or below, and
iowait rise to about 25%, equivalent to one CPU. That is what I had
originally expected.
(I can also see my expected numbers, if I disable *all* C-states and
force polling using `pm_qos_resume_latency_us` in sysfs).
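
For anyone who wants to check the arithmetic without `vmstat`, here is a
minimal sketch of how the idle and iowait percentages can be derived from
two samples of the aggregate "cpu" line in /proc/stat. It assumes the field
order documented in proc(5) (user, nice, system, idle, iowait, irq, softirq,
steal). The function names are hypothetical, and the two sample lines are
made up to match the 85% / 15% split described above; they are not real
measurements.

```python
# Sketch: share of time spent in "idle" and "iowait" between two samples of
# the aggregate "cpu" line in /proc/stat. Function names are illustrative.

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line into a dict of cumulative tick counts."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Only the first eight counters are used. The guest fields that follow
    # are already included in user/nice, so adding them would double count.
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def iowait_summary(before_line, after_line, ncpus):
    """Return idle %, iowait %, and iowait expressed as a number of CPUs."""
    before = parse_cpu_line(before_line)
    after = parse_cpu_line(after_line)
    delta = {name: after[name] - before[name] for name in FIELDS}
    total = sum(delta.values())
    if total <= 0:
        raise ValueError("samples must be taken at different times")
    return {
        "idle_pct": 100.0 * delta["idle"] / total,
        "iowait_pct": 100.0 * delta["iowait"] / total,
        "iowait_cpus": ncpus * delta["iowait"] / total,
    }


def read_cpu_line(path="/proc/stat"):
    """Read the first line of /proc/stat (the aggregate 'cpu' line)."""
    with open(path) as f:
        return f.readline()


if __name__ == "__main__":
    # Made-up samples: 3400 idle ticks and 600 iowait ticks in the interval.
    before = "cpu  0 0 0 1000 100 0 0 0 0 0"
    after = "cpu  0 0 0 4400 700 0 0 0 0 0"
    print(iowait_summary(before, after, ncpus=4))
```

With these made-up samples, 15% iowait on 4 CPUs works out to 0.6 of one
CPU, which is the "much less than one CPU" case. On a live system, calling
read_cpu_line() twice, about 10 seconds apart, would give the two inputs.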
The numbers above are from a kernel somewhere around v5.2-rc5. I saw
the "wrong" results on some previous kernels as well. I just now
realized the link to NO_HZ_IDLE.[1]
[1]
https://unix.stackexchange.com/questions/517757/my-basic-assumption-about-system-iowait-does-not-hold/527836#527836
I did not find any information about this high level of inaccuracy. Can
anyone explain whether this behaviour is expected?
I found several patches that mentioned "iowait" and NO_HZ_IDLE. But if
they described this problem, it was not clear to me.
I thought this might also be affecting the "IO pressure" values from the
new "pressure stall information"... but I am too confused already, so I
am only asking about iowait at the moment :-).[2]
[2]
https://unix.stackexchange.com/questions/527342/why-does-the-new-linux-pressure-stall-information-for-io-not-show-as-100/527347#527347
I have seen the disclaimers for iowait in
Documentation/filesystems/proc.txt, and the derived man page.
Technically, the third disclaimer might cover anything. But I was
optimistic; I hoped it was talking about relatively small glitches :-).
I didn't think it would mean a large, systematic undercounting that
applies to the vast majority of current systems (which are not tuned for
realtime use).
    - iowait: In a word, iowait stands for waiting for I/O to complete. But there
      are several problems:
      1. Cpu will not wait for I/O to complete, iowait is the time that a task is
         waiting for I/O to complete. When cpu goes into idle state for
         outstanding task io, another task will be scheduled on this CPU.
      2. In a multi-core CPU, the task waiting for I/O to complete is not running
         on any CPU, so the iowait of each CPU is difficult to calculate.
      3. The value of iowait field in /proc/stat will decrease in certain
         conditions
Thanks for all the power-saving code
Alan