Re: [GIT PULL] cputime patch for 2.6.30-rc6

From: Martin Schwidefsky
Date: Mon May 25 2009 - 07:07:50 EST

On Wed, 20 May 2009 09:44:33 +0100 (BST)
Michael Abbott <michael@xxxxxxxxxxxxxxx> wrote:

> On Wed, 20 May 2009, Martin Schwidefsky wrote:
> > On Tue, 19 May 2009 11:31:28 +0200 Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > > On Tue, 2009-05-19 at 11:00 +0200, Martin Schwidefsky wrote:
> > > > I don't see a problem here. In an idle multiple cpu system there IS
> > > > more idle time than elapsed time. What would make sense is to
> > > > compare elapsed time * #cpus with the idle time. But then there is
> > > > cpu hotplug which forces you to look at the delta of two measuring
> > > > points where the number of cpus did not change.
> >
> > Well, we better distinguish between the semantical problem and the
> > performance consideration, no? One thing is what the proc field is
> > supposed to contain, the other is how fast you can do it.
> >
> > I have been referring to the semantical problem, but your point with the
> > performance is very valid as well. So
> >
> > 1) are we agreed that the second field of /proc/uptime should contain
> > the aggregate idle time of all cpus?
>
> I think that this very simple node (only two fields, let me call them
> uptime and idle) should be amenable to simple interpretation. In
> particular, I'd like to be able to compute
>     busy = uptime - idle

On a single cpu system this works. On an SMP with cpu hotplug it breaks.
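A minimal sketch of why the subtraction breaks, assuming the idle field holds the sum over all cpus. The function name and all numbers are made up for illustration and are not taken from any real system:

```python
# Hypothetical values in seconds; busy_time() is an illustrative helper,
# not something the kernel or procps provides.
def busy_time(uptime, idle):
    """Naive interpretation of /proc/uptime: busy = uptime - idle."""
    return uptime - idle

# Single cpu: idle can never exceed uptime, so the result is meaningful.
single = busy_time(uptime=1000.0, idle=900.0)

# Four cpus, each idle for 900 of the 1000 elapsed seconds, with the
# idle field summed over all cpus: idle exceeds the elapsed time.
smp = busy_time(uptime=1000.0, idle=4 * 900.0)

print(single)  # 100.0
print(smp)     # -2600.0
```

The negative result on the four-cpu case is the point: the formula only has a meaning once the idle value is related to the number of cpus.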

> In other words, idle should be in the same units of time as uptime, which
> is to say it should be in units of elapsed time, not aggregate CPU time.

They are, no? The idle time is in units of cputime but what is measured
is wall-clock time. In fact the idle time is the result of the
calculation
    idle = wall_clock - user - nice - system -
           iowait - irq - softirq - steal - guest
Trouble is that this is fundamentally per cpu. Just showing a single
cpu is kaputt, showing the sum of all cpus divided by the number of
cpus has a problem with cpu hotplug. That leaves the simple approach to
just do the sum over all cpus.
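The calculation above, and the sum over all cpus, can be sketched roughly as follows. This is only an illustration under assumptions: the helper names and sample values are hypothetical, every cpu is assumed to have been online for the whole interval, and the real kernel accounting is not shown here.

```python
# Accounting buckets that get subtracted from wall-clock time, as listed
# in the formula above.
FIELDS = ("user", "nice", "system", "iowait",
          "irq", "softirq", "steal", "guest")

def idle_time(wall_clock, accounted):
    """Per-cpu idle: wall-clock time minus everything accounted elsewhere."""
    return wall_clock - sum(accounted.get(name, 0.0) for name in FIELDS)

def aggregate_idle(wall_clock, per_cpu):
    """Idle time summed over all cpus."""
    return sum(idle_time(wall_clock, cpu) for cpu in per_cpu)

def busy_fraction(elapsed, idle_delta, ncpus):
    """Compare idle against elapsed time * #cpus.

    Only meaningful for the delta between two measuring points where
    the number of cpus did not change (no hotplug in between).
    """
    return 1.0 - idle_delta / (elapsed * ncpus)

# Two hypothetical cpus observed over 100 seconds of wall-clock time.
per_cpu = [
    {"user": 30.0, "system": 10.0},   # idle: 60 seconds
    {"user": 5.0, "iowait": 5.0},     # idle: 90 seconds
]
total_idle = aggregate_idle(100.0, per_cpu)   # 150 seconds, more than elapsed
frac = busy_fraction(100.0, total_idle, len(per_cpu))
```

With these sample values the aggregate idle time (150 s) exceeds the elapsed time (100 s), yet dividing by elapsed time * #cpus still yields a sensible busy fraction as long as the cpu count stayed constant.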

> Thus, the patch to /proc/uptime that we're discussing has, in the view I'm
> trying to argue, replaced a clearly broken idle field with a more subtly
> broken version.

What is the least broken version?

> The strongest point in my case is this: the first field is in units of
> elapsed wall clock time, and therefore the second should also be. Of
> course, established practice here is also important, but there is some
> ambiguity in the man pages I'm looking at -- this field is described in
> proc(5) as:
>     the amount of time spent in idle process (seconds)
> Not aggregate CPU time, but elapsed time, I think is meant here, but the
> wording is ambiguous. Grepping for uptime in Documentation doesn't come
> up with anything clear either, but I can quote old (2.6.9) precedent
> (don't have a more recent multi-core system to hand).

In my view of the world the idle field IS wall clock. That is how we
calculate it on s390.

blue skies,

"Reality continues to ruin my life." - Calvin.
