Re: Have we changed /proc/stat idle statistics by NOHZ kernel?

From: Andrew Morton
Date: Mon Aug 01 2011 - 16:00:22 EST


On Mon, 25 Jul 2011 16:33:13 +0200
Michal Hocko <mhocko@xxxxxxx> wrote:

> Hi,
> we have a customer reporting that /proc/stat doesn't provide correct
> results about idle time if the machine is idle.
> The issue is caused by the fact that a tickless kernel doesn't update
> kstat_cpu(i).cpustat.idle while it is tickless. Tools that parse this
> file interpret the unchanged value as 0% idle since the last time.
> While I personally do not think that measuring an idle machine is
> that important, one could say that the semantics of the file have changed
> with NOHZ, which is not good as we are trying to keep this interface
> stable.
> One way to fix this is to consider the current status of idle in
> show_stat. A very primitive attempt at that can be seen below (on
> top of the current Linus tree). I know it has several issues; it just
> illustrates what I am trying to say. It will not work if jiffies
> overflow while the CPU was tickless, and it also misses locking and
> handling of the !NOHZ configuration.
>
> I have also noticed we have get_cpu_idle_time_us, which should do
> something similar. Should it be used instead, or is it more intrusive?
>
> Btw. is this considered to be a problem at all?
>

I'd consider it a bug and a regression. If the machine was idle and
/proc/stat says "zero idle time" then that is simply incorrect.
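
To make the failure concrete, here is a rough userspace sketch of how a
monitoring tool might turn two samples of a /proc/stat "cpu" line into an
idle percentage. The function names, the sample lines, and the choice to
return 0.0 when no ticks elapsed are assumptions made for illustration;
real tools differ in detail.

```python
# Position of the idle counter among the numeric fields of a "cpu" line:
# user nice system idle iowait irq softirq steal ...
IDLE_INDEX = 3


def parse_cpu_line(line):
    """Return the first eight counters of a /proc/stat cpu line as ints."""
    fields = line.split()
    if not fields or not fields[0].startswith("cpu"):
        raise ValueError("not a cpu line: %r" % line)
    # Guest time is already included in user time, so the trailing
    # guest fields are skipped to avoid counting it twice.
    return [int(value) for value in fields[1:9]]


def idle_percent(prev_line, curr_line):
    """Idle share of the ticks that elapsed between two samples."""
    prev = parse_cpu_line(prev_line)
    curr = parse_cpu_line(curr_line)
    deltas = [c - p for p, c in zip(prev, curr)]
    total = sum(deltas)
    if total <= 0:
        # Assumption: no ticks accounted at all is reported as 0% idle.
        return 0.0
    return 100.0 * deltas[IDLE_INDEX] / total


# Idle counter frozen while one user tick was accounted: reads as 0% idle.
before = "cpu 100 0 50 1000 0 0 0 0"
frozen = "cpu 101 0 50 1000 0 0 0 0"
print(idle_percent(before, frozen))
```

With the idle counter frozen, any small amount of accounted activity
makes the interval look fully busy, which matches what the customer saw.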

Can we just cheat? Subtract elapsed R and D time from elapsed wall
time and print that out?
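
The arithmetic would look roughly like the sketch below. It is only an
illustration of the idea, not kernel code: the function and parameter
names are made up, "R and D time" is read here as time accounted to
running tasks and to tasks in uninterruptible sleep, and all values are
assumed to be in the same tick unit.

```python
def derived_idle_ticks(wall_ticks, running_ticks, dstate_ticks):
    """Derive idle time as wall time minus accounted R and D time.

    All three arguments are elapsed ticks over the same interval.
    The names are hypothetical; nothing here mirrors a kernel API.
    """
    idle = wall_ticks - running_ticks - dstate_ticks
    # Clamp at zero so rounding or sampling skew cannot yield negative idle.
    return max(idle, 0)


# 1000 ticks of wall time, 10 running, 5 in D state.
print(derived_idle_ticks(1000, 10, 5))
```

Deriving idle this way does not depend on the idle counter being
updated while the CPU is tickless, which is the point of the cheat.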
