Re: [RFC PATCH 10/15] nohz_task: Enter in extended quiescent statewhen in userspace

From: Peter Zijlstra
Date: Tue Dec 21 2010 - 03:05:16 EST


On Tue, 2010-12-21 at 02:27 +0100, Frederic Weisbecker wrote:
> On Mon, Dec 20, 2010 at 05:18:40PM +0100, Peter Zijlstra wrote:
> > On Mon, 2010-12-20 at 16:24 +0100, Frederic Weisbecker wrote:
> > > (we check if the local cpu is in nohz mode, which means
> > > no other task compete on that CPU)
> >
> > You keep repeating that definition, but its not true.. It means there is
> > not work for the tick to do, the tick does _tons_ more besides
> > preemption, so nr_running==1 is necessary but not sufficient.
>
> Sure but the point is that if the tick is not running, it means
> that the nohz task is the only task running on that CPU.

No, that too isn't true, the cpu could be idle.

> Now indeed there are other reasons for the tick to restart like RCU
> or the effective nohz mode to physically happen or to be delayed,
> which is decided by tick_nohz_stop_sched_tick().

You really really really badly need to read through the whole tick path
and look at all the things it does, put them in a list, then look at the
current nohz code, mark those it deals with, then go through the nohz
code again and find the nr_running==0 assumptions, then make sure you've
covered everything.

I'm very confident you'll find a number of things to fix. At the very
least your current patch set totally forgets about task runtime
accounting (account_process_tick()), the existing NO_HZ doesn't need to
worry about that because the system is idle so all it needs to do is add
idle ticks when it wakes up (and possibly steal time for the virt muck).

You also miss the profile_tick(), and you need to go through the load
accounting muck (both of them) to see if there are any nr_running==0
assumptions there.

You also need to fix the perf counter list rotation stuff, and again,
check if no_hz load-balancing can deal with the nr_running==1 situation.

Now, there might be more, this is just a quick one-minute scan through
the tick code, but the fact that nowhere in this whole patch-set is even
a mention of these things makes me worry about your whole approach.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/