Re: [RFC] vmstat: Avoid waking up idle-cpu to service shepherd work

From: Peter Zijlstra
Date: Fri Mar 27 2015 - 05:30:39 EST


On Fri, Mar 27, 2015 at 10:16:13AM +0100, Peter Zijlstra wrote:
> On Fri, Mar 27, 2015 at 10:19:54AM +0530, Viresh Kumar wrote:
> > On 27 March 2015 at 01:48, Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> wrote:
> > > Shouldn't this be viewed as a shortcoming of the core timer code?
> >
> > Yeah, it is. Some (not so pretty) solutions were tried earlier to fix that, but
> > they were rejected for obvious reasons [1].
> >
> > > vmstat_shepherd() is merely rescheduling itself with
> > > schedule_delayed_work(). That's a dead bog simple operation and if
> > > it's producing suboptimal behaviour then we shouldn't be fixing it with
> > > elaborate workarounds in the caller?
> >
> > I understand that, and that's why I sent it as an RFC to get the discussion
> > started. Does anyone else have another (acceptable) idea to get this
> > resolved?
>
> So the issue seems to be that we need base->running_timer in order to
> tell if a callback is running, right?
>
> We could align the base on 8 bytes to gain an extra bit in the pointer
> and use that bit to indicate the running state. Then these sites can
> spin on that bit while we can change the actual base pointer.

Even though tvec_base has ____cacheline_aligned stuck on it, most
instances are allocated using kzalloc_node(), which does not actually
respect that but already guarantees a minimum u64 (8-byte) alignment, so
I think we can use that third bit without too much magic.
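
A minimal userspace sketch of the pointer-tagging idea, assuming an
8-byte-aligned base so the three low bits of its address are always
zero. All names here (fake_base, BASE_RUNNING_FLAG, BASE_FLAG_MASK,
pack_base and friends) are hypothetical and are not the kernel's
definitions; the sketch also leaves out the locking and memory ordering
a real implementation would need.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only; not the kernel's definitions. */
#define BASE_RUNNING_FLAG ((uintptr_t)0x4) /* third low bit */
#define BASE_FLAG_MASK    ((uintptr_t)0x7) /* bits freed by 8-byte alignment */

struct fake_base {
	uint64_t id; /* stand-in payload; the sketch assumes 8-byte alignment */
};

/* Combine an 8-byte-aligned base pointer with flag bits in one word. */
static uintptr_t pack_base(struct fake_base *base, uintptr_t flags)
{
	/* The low bits must be free, otherwise the tag would corrupt the pointer. */
	assert(((uintptr_t)base & BASE_FLAG_MASK) == 0);
	return (uintptr_t)base | (flags & BASE_FLAG_MASK);
}

/* Recover the real pointer by masking off the flag bits. */
static struct fake_base *unpack_base(uintptr_t word)
{
	return (struct fake_base *)(word & ~BASE_FLAG_MASK);
}

/* Test the running bit without looking at the pointer part. */
static int base_is_running(uintptr_t word)
{
	return (word & BASE_RUNNING_FLAG) != 0;
}

/* Swap the pointer part while keeping whatever flag bits are set. */
static uintptr_t switch_base(uintptr_t word, struct fake_base *new_base)
{
	return pack_base(new_base, word & BASE_FLAG_MASK);
}
```

With this layout a waiter only has to look at base_is_running() on the
combined word, while switch_base() can change where the pointer part
points without disturbing the running bit.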