[PATCH 0/5] sched,time: fix irq time accounting with nohz_idle

From: riel
Date: Thu Jun 16 2016 - 12:06:42 EST


Currently irq time accounting only works in these cases:
1) purely ticke based accounting
2) nohz_full accounting, but only on housekeeping & nohz_full CPUs
3) architectures with native vtime accounting

On nohz_idle CPUs, which are probably the majority nowadays,
irq time accounting is currently broken. This leads to systems
reporting a dramatically lower amount of irq & softirq time than
is actually spent handling them, with all the time spent while the
system is in the idle task being accounted as idle.

This patch set seems to bring the amount of irq time reported by
top (and /proc/stat) roughly in line with that measured when I do
a "perf record -g -a" run to see what is using all that time.

The amount of irq time used, especially softirq, is shockingly high,
to the point of me thinking this patch set may be wrong, but the
numbers seem to match what perf is giving me...

These patches apply on top of Wanpeng Li's steal time patches.

CONFIG_IRQ_TIME_ACCOUNTING is now a config option that is available
as a separate choice from tick based / nohz_idle / nohz_full mode,
a suggested by Frederic Weisbecker.

Next up: look at the things that are using CPU time on an otherwise
idle system, and see if I can make those a little faster :)