Re: [RFC] vmstat: Avoid waking up idle-cpu to service shepherd work

From: Peter Zijlstra
Date: Fri Mar 27 2015 - 05:16:28 EST


On Fri, Mar 27, 2015 at 10:19:54AM +0530, Viresh Kumar wrote:
> On 27 March 2015 at 01:48, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > Shouldn't this be viewed as a shortcoming of the core timer code?
>
> Yeah, it is. Some (not so pretty) solutions were tried earlier to fix that, but
> they were rejected for obvious reasons [1].
>
> > vmstat_shepherd() is merely rescheduling itself with
> > schedule_delayed_work(). That's a dead bog simple operation and if
> > it's producing suboptimal behaviour then we shouldn't be fixing it with
> > elaborate workarounds in the caller?
>
> I understand that, and that's why I sent it as an RFC to get the discussion
> started. Does anyone else have another (acceptable) idea to get this
> resolved?

So the issue seems to be that we need base->running_timer in order to
tell if a callback is running, right?

We could align the base on 8 bytes to gain an extra bit in the pointer
and use that bit to indicate the running state. Then these sites can
spin on that bit while we can change the actual base pointer.

Since the timer->base pointer is locked through the base->lock and
hand-over is safe vs lock_timer_base, this should all work.
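
A rough userspace sketch of the pointer-tagging idea above, not actual
kernel code. All demo_* names are hypothetical, and using bit 2 for the
running flag is an assumption (8-byte alignment leaves the three low
address bits zero; the lower two are left alone here in case they carry
other flags). Locking and memory ordering are deliberately omitted; in the
kernel the updates would have to happen under base->lock.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CPU timer base, forced to 8-byte
 * alignment so the three low bits of its address are always zero. */
struct demo_base {
	alignas(8) int cpu;
};

#define DEMO_FLAG_MASK	((uintptr_t)0x7)	/* bits freed by the alignment */
#define DEMO_RUNNING	((uintptr_t)0x4)	/* assumed "callback running" bit */

/* Hypothetical timer: the base pointer and the flag bits share one word. */
struct demo_timer {
	uintptr_t base;
};

/* Strip the flag bits to recover the real base pointer. */
static inline struct demo_base *demo_timer_base(const struct demo_timer *t)
{
	return (struct demo_base *)(t->base & ~DEMO_FLAG_MASK);
}

/* What a waiter would poll instead of comparing base->running_timer. */
static inline bool demo_timer_running(const struct demo_timer *t)
{
	return (t->base & DEMO_RUNNING) != 0;
}

static inline void demo_timer_set_running(struct demo_timer *t, bool running)
{
	if (running)
		t->base |= DEMO_RUNNING;
	else
		t->base &= ~DEMO_RUNNING;
}

/* Move the timer to another base while preserving the flag bits, so the
 * running state survives a change of the base pointer. */
static inline void demo_timer_set_base(struct demo_timer *t,
				       struct demo_base *new_base)
{
	assert(((uintptr_t)new_base & DEMO_FLAG_MASK) == 0);
	t->base = (uintptr_t)new_base | (t->base & DEMO_FLAG_MASK);
}
```

The point of the sketch is only that demo_timer_running() keeps returning
true across demo_timer_set_base(), so a waiter can spin on the bit without
caring which base the timer currently points at.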