Re: 50 Watt idle power regression bisected to Linux-3.10

From: Peter Zijlstra
Date: Thu Dec 12 2013 - 03:52:10 EST


On Wed, Dec 11, 2013 at 03:08:35PM -0800, H. Peter Anvin wrote:
> On 12/11/2013 09:50 AM, Ingo Molnar wrote:
> >
> > Well, availability could be a problem too, if some CPU (real or
> > virtual) implements MWAIT but not CLFLUSH.
> >
> > In theory we could make mwait an alternatives variant and patch in the
> > right combination of instructions? The CLFLUSH goes to the same
> > address as on which the monitoring happens, so it could be considered
> > one meta-instruction.
> >
>
> The first thing to do is probably to drop the use of thread_info as a
> wakeup doorbell. It seemed like a good idea at the time -- after all,
> there is one for each thread -- but it is extremely likely to be dirty
> in the cache, which is (presumably) what causes these kinds of bugs to
> be maximally likely. Even if we don't do the CLFLUSH it is likely that
> the hardware has to do something expensive behind the scenes.
>
> So I would like to propose that we switch to using a percpu variable
> which is a single cache line of nothing at all. It would only ever be
> touched by MONITOR and for explicit wakeup. Hopefully that will resolve
> this problem without the need for the CLFLUSH.

We use thread_info::flags because the wakeup path has to write
TIF_NEED_RESCHED into it anyhow.

Using another cacheline would mean the wakeup path has to write a
second cross-CPU cacheline -- that is badness too.

So no, I don't think we want to monitor another line.
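
To make the cost argument concrete, here is a minimal user-space sketch
of the two wakeup schemes. It is not kernel code: every sim_* name, the
64-byte line size and the flag bit value are assumptions made for
illustration, and C11 atomics stand in for the real cross-CPU stores.
It only counts how many distinct remote cache lines the waker has to
dirty in each scheme.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins, not the kernel's definitions. */
#define SIM_CACHELINE        64
#define SIM_TIF_NEED_RESCHED (1u << 3)

/* Models thread_info: the flags word lives on the monitored line. */
struct sim_thread_info {
	_Alignas(SIM_CACHELINE) atomic_uint flags;
};

/* Models the proposed per-CPU doorbell: a line used only for wakeups. */
struct sim_doorbell {
	_Alignas(SIM_CACHELINE) atomic_uint rung;
};

/*
 * Current scheme: the store that sets the resched flag is also the
 * store that hits the monitored line. Returns the number of remote
 * cache lines written.
 */
static int sim_wake_shared(struct sim_thread_info *ti)
{
	assert(ti != NULL);
	atomic_fetch_or(&ti->flags, SIM_TIF_NEED_RESCHED);
	return 1;
}

/*
 * Doorbell scheme: the flag still has to be set, and the separate
 * doorbell line has to be written as well to trigger the wakeup.
 */
static int sim_wake_doorbell(struct sim_thread_info *ti,
			     struct sim_doorbell *db)
{
	assert(ti != NULL && db != NULL);
	atomic_fetch_or(&ti->flags, SIM_TIF_NEED_RESCHED);
	atomic_store(&db->rung, 1u);
	return 2;
}

/* What the woken CPU checks once it leaves the idle wait. */
static bool sim_need_resched(struct sim_thread_info *ti)
{
	return (atomic_load(&ti->flags) & SIM_TIF_NEED_RESCHED) != 0;
}
```

In this model the doorbell variant always performs one more remote
store than the shared variant, which is the extra cross-CPU cacheline
write objected to above. The sketch says nothing about how expensive a
dirty monitored line is for the hardware, which is the other half of
the trade-off.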