Re: [PATCH 09/32] nohz: Move ts->idle_calls into strict idle logic

From: Frederic Weisbecker
Date: Tue Sep 06 2011 - 12:35:48 EST

Next message: Konrad Rzeszutek Wilk: "Re: [Xen-devel] Re: [Revert] Re: [PATCH] mm: sync vmalloc addressspace page tables in alloc_vm_area()"
Previous message: Tejun Heo: "Re: sata_mv port lockup on hotplug (kernel 2.6.38.2)"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Aug 30, 2011 at 05:33:42PM +0200, Peter Zijlstra wrote:
> On Tue, 2011-08-30 at 16:45 +0200, Frederic Weisbecker wrote:
> > On Mon, Aug 29, 2011 at 08:33:23PM +0200, Peter Zijlstra wrote:
> > > On Mon, 2011-08-29 at 20:23 +0200, Frederic Weisbecker wrote:
> > >
> > > > > Well, no, on interrupt return you shouldn't do anything. If you've
> > > > > stopped the tick it stays stopped until you do something that needs it,
> > > > > then that action will re-enable it.
> > > >
> > > > Sure, when something needs the tick in this mode, we usually
> > > > receive an IPI and restart the tick from there but then
> > > > tick_nohz_stop_sched_tick() handles the cases with *needs_cpu()
> > > > very well on interrupt return (our IPI return) by doing a kind
> > > > of "light" HZ mode by logically switching to nohz mode but
> > > > with the next timer happening in HZ, assuming it's a matter
> > > > of one tick and we will switch to a real nohz behaviour soon.
> > > >
> > > > I don't see a good reason to duplicate that logic with a pure
> > > > restart from the IPI.
> > >
> > > That sounds like an optimization, and should thus be done later.
> >
> > The optimization is already there upstream. I can split the logic for
> > non-idle case but I'm not sure about the point of that.
>
> care to point me to the relevant code, because I can't remember a
> half-assed nohz state.. then again, maybe I didn't look hard enough.

It's in tick_nohz_stop_sched_tick():

	if (rcu_needs_cpu(cpu) || printk_needs_cpu(cpu) ||
	    arch_needs_cpu(cpu)) {
		next_jiffies = last_jiffies + 1;
		delta_jiffies = 1;
	} else {
		/* Get the next timer wheel timer */
		next_jiffies = get_next_timer_interrupt(last_jiffies);
		delta_jiffies = next_jiffies - last_jiffies;
	}

There are two cases:

- tick_nohz_stop_sched_tick() is called from idle. If one of the *_needs_cpu()
is true, then the tick is not stopped. It doesn't even enter the nohz logic.

- tick_nohz_stop_sched_tick() is called from an interrupt exit while we are
idle. If one of the *_needs_cpu() is true and the tick was previously
stopped, we program the next timer to fire in 1 jiffy, without exiting
nohz mode. Otherwise, if the tick was not stopped yet, we don't stop it.

In effect, those *_needs_cpu() conditions are treated like timer list timers
or hrtimers that expire one jiffy from now.
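
To make the two cases concrete, here is a minimal userspace sketch of the
decision described above. It is a model, not the kernel function:
model_stop_tick(), enum tick_action and struct tick_decision are
hypothetical names, the three *_needs_cpu() checks are folded into a single
needs_cpu flag, and it assumes next_timer is later than last_jiffies. The
"called from idle" case is modelled as tick_stopped == false.

```c
#include <stdbool.h>

/* Hypothetical model types; these names are illustrative, not kernel API. */
enum tick_action {
	TICK_KEEP_PERIODIC,	/* tick left running, nohz logic not entered */
	TICK_NOHZ_ONE_JIFFY,	/* stay in nohz mode, next event in 1 jiffy */
	TICK_NOHZ_DEFERRED,	/* nohz mode, next event at the next timer */
};

struct tick_decision {
	enum tick_action action;
	unsigned long next_jiffies;
};

/*
 * needs_cpu stands in for rcu_needs_cpu() || printk_needs_cpu() ||
 * arch_needs_cpu(). next_timer stands in for the value returned by
 * get_next_timer_interrupt() and is assumed to be > last_jiffies.
 */
static struct tick_decision
model_stop_tick(bool tick_stopped, bool needs_cpu,
		unsigned long last_jiffies, unsigned long next_timer)
{
	struct tick_decision d;
	unsigned long delta_jiffies;

	if (needs_cpu) {
		/* Treated like a timer that expires one jiffy from now. */
		d.next_jiffies = last_jiffies + 1;
		delta_jiffies = 1;
	} else {
		d.next_jiffies = next_timer;
		delta_jiffies = next_timer - last_jiffies;
	}

	if (!tick_stopped && delta_jiffies == 1)
		d.action = TICK_KEEP_PERIODIC;	/* case 1: don't stop the tick */
	else if (delta_jiffies == 1)
		d.action = TICK_NOHZ_ONE_JIFFY;	/* case 2: "light" HZ mode */
	else
		d.action = TICK_NOHZ_DEFERRED;

	return d;
}
```

With tick_stopped == false and needs_cpu == true the model keeps the
periodic tick; with tick_stopped == true it stays in nohz mode and only
moves the next event to last_jiffies + 1.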

When we are in such a scheme:

	tick_nohz_stop_sched_tick();
	while (!need_resched())
		hlt();
	tick_nohz_restart_sched_tick();

This avoids going back and forth between the hz and nohz logic from inside
the loop.

> > > > > > That said I wonder if some of the above conditions should restore a periodic
> > > > > > behaviour on interrupt return...
> > > > >
> > > > > I would expect the tick not to be stopped when tick_nohz_can_stop_tick()
> > > > > returns false. If it returns true, then I expect anything that needs it
> > > > > to re-enable it.
> > > > >
> > > >
> > > > Yeah. In the case of need_resched() in idle I believe the CPU doesn't
> > > > really go to sleep later so it should be fine. But for the case of
> > > > softirq pending or nohz_mode, I'm not sure...
> > >
> > > softirqs shouldn't be pending when you go into nohz mode..
> >
> > You mean it can't happen or we don't want that to happen?
>
> We don't want that.. going into nohz with pending softirqs means the
> softirqs will be delayed for an unknown amount of time, this should not
> occur.
>
> tick_nohz_stop_sched_tick() has:
>
> 	if (unlikely(local_softirq_pending() && cpu_online(cpu))) {
> 		static int ratelimit;
>
> 		if (ratelimit < 10) {
> 			printk(KERN_ERR "NOHZ: local_softirq_pending %02x\n",
> 			       (unsigned int) local_softirq_pending());
> 			ratelimit++;
> 		}
> 		goto end;
> 	}
>
> which should warn us if this ever was to occur.

Yeah. Moreover, that check is weird: if softirqs are pending at that
point, the ksoftirqd thread should already have been woken up. Either we
come from idle and we are not in an interrupt (a softirq raised outside
interrupt context is handed to ksoftirqd), or we come from an interrupt
and the softirqs have already been executed. If there were too many of
them, ksoftirqd would have been woken up as well.

Thus the need_resched() condition that comes before should have exited the
function already.
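
The ordering argument can be sketched as a small userspace model. All names
here (struct cpu_model, model_raise_softirq_from_task(), model_checks()) are
hypothetical, and the model assumes that waking ksoftirqd on an idle CPU
sets need_resched, which is the premise of the reasoning above rather than
something this sketch proves.

```c
#include <stdbool.h>

/* Hypothetical model of the per-CPU state consulted by the two checks. */
struct cpu_model {
	bool need_resched;
	unsigned int softirq_pending;	/* bitmask of pending softirqs */
};

enum stop_result {
	EXIT_NEED_RESCHED,	/* left the function at the need_resched() check */
	EXIT_SOFTIRQ_WARN,	/* reached the "NOHZ: local_softirq_pending" warning */
	PROCEED,		/* neither check fired */
};

/* Assumption: waking ksoftirqd on an idle CPU marks it for rescheduling. */
static void model_wake_ksoftirqd(struct cpu_model *c)
{
	c->need_resched = true;
}

/* A softirq raised outside interrupt context is handed to ksoftirqd. */
static void model_raise_softirq_from_task(struct cpu_model *c, unsigned int nr)
{
	c->softirq_pending |= 1u << nr;
	model_wake_ksoftirqd(c);
}

/* The need_resched() test comes before the softirq-pending test. */
static enum stop_result model_checks(const struct cpu_model *c)
{
	if (c->need_resched)
		return EXIT_NEED_RESCHED;
	if (c->softirq_pending)
		return EXIT_SOFTIRQ_WARN;
	return PROCEED;
}
```

In this model the warning branch is only reachable if a softirq becomes
pending without ksoftirqd being woken, which is exactly the situation the
printk is there to catch.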

> > >
> > > That is, I'm really not seeing what's wrong with the very simple:
> > >
> > >
> > > 	if (tick_nohz_can_stop_tick())
> > > 		tick_nohz_stop_tick();
> > >
> > >
> > > and relying on everybody who invalidates tick_nohz_can_stop_tick(), to
> > > do:
> > >
> > > tick_nohz_start_tick();
> >
> > May be for the non-idle case. But for the idle case I need to ensure
> > this is necessary somewhere.
>
> How exactly do idle and non-idle differ? Its about stopping the tick,
> regardless of if we're idle or not if someone needs the thing we need to
> start it again.

They may differ because of the way the idle loop is structured: it treats
some conditions as a 1-jiffy timer to avoid a full tick restart.