Re: [PATCH 0/6] x86/cpu hotplug: Wake up offline CPU via mwait or nmi

From: Rusty Russell
Date: Mon Jun 04 2012 - 21:07:14 EST


On Mon, 04 Jun 2012 22:33:21 +0200, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Mon, 2012-06-04 at 22:11 +0200, Thomas Gleixner wrote:
>
> > I understand what you are trying to do, though I completely disagree
> > with the solution.
> >
> > The main problem of the current hotplug code is that it is an all or
> > nothing approach. You have to tear down the whole thing completely
> > instead of just taking it out of the usable set of cpus.
> >
> > I'm working on a proper state machine driven online/offline sequence,
> > where you can put the cpu into an intermediate state which avoids
> > bringing it down completely. This is enough to get the full
> > powersaving benefits w/o having to go through all the synchronization
> > states of a full online/offline. That will shorten the onlining time
> > of a previously offlined cpu to almost nothing.
> >
> > I really want to avoid adding more bandaids to the hotplug code before
> > we have sorted out the existing horror.
>
> It's far worse... you shouldn't _ever_ care about hotplug latency unless
> you've got absolutely braindead hardware. We all know ARM has been
> particularly creative here, but is Intel now trying to trump ARM at
> stupid?

I disagree. Deactivating a cpu for power saving is halfway to hotplug
anyway. I'd rather unify the two cases, where we can specify how dead a
CPU should be, than have individual archs and boards do random hacks.

It also gives us a great excuse to audit and tidy up many of the
hotplug cpu callbacks; most of the ones I've looked at have been racy :(

The ones which simply want to keep per-cpu stats can be given a nice
helper with two simple callbacks: one to empty stats for a going-away
cpu, and (maybe) one to restore them.
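
To make that concrete, here is a rough userspace sketch of what such a
helper might look like. Nothing here exists in the tree: the names
(cpu_stats_ops, cpu_stats_going, cpu_stats_coming) and the pkt_count
counter are made up for illustration, and a plain array stands in for
real per-cpu data.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4	/* assumption: small fixed cpu count for the sketch */

/* Hypothetical helper interface, not an existing kernel API. */
struct cpu_stats_ops {
	void (*drain)(unsigned int cpu);	/* empty stats of a going-away cpu */
	void (*restore)(unsigned int cpu);	/* optional, may be NULL */
};

static unsigned long pkt_count[NR_CPUS];	/* stand-in for a per-cpu counter */
static unsigned long pkt_count_offline;		/* counts folded in from dead cpus */

/* Fold the departing cpu's count into a global so totals stay correct. */
static void pkt_drain(unsigned int cpu)
{
	assert(cpu < NR_CPUS);
	pkt_count_offline += pkt_count[cpu];
	pkt_count[cpu] = 0;
}

static const struct cpu_stats_ops pkt_stats_ops = {
	.drain = pkt_drain,
	.restore = NULL,	/* nothing to restore for a simple counter */
};

/* What the hotplug core would call on the way down (hypothetical). */
static void cpu_stats_going(const struct cpu_stats_ops *ops, unsigned int cpu)
{
	ops->drain(cpu);
}

/* What the hotplug core would call on the way up (hypothetical). */
static void cpu_stats_coming(const struct cpu_stats_ops *ops, unsigned int cpu)
{
	if (ops->restore)
		ops->restore(cpu);
}

/* Readers sum the live cpus plus whatever was drained earlier. */
static unsigned long pkt_total(void)
{
	unsigned long sum = pkt_count_offline;
	unsigned int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += pkt_count[cpu];
	return sum;
}
```

The point is that a subsystem only supplies the two callbacks and never
touches the notifier machinery itself; the locking a real version would
need is left out here.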

The per-cpu kthreads should no longer get torn down and recreated, and
doing it via a separate notifier function is ugly and error-prone. My
plan is a "bool kthread_cpu_going(void)" and then a "void
kthread_cpu_can_go(void)", so kthreads can do:

	if (kthread_cpu_going()) {
		/* Do any cleanup we need. */
		...

		/* This returns when CPU comes back. */
		kthread_cpu_can_go();
	}

Yeah, we should probably have the kthread exit inside
kthread_cpu_can_go() if they stop the kthread, but that's a detail.
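
For illustration only, a single-threaded userspace mock of how one
iteration of such a kthread might look. The two helper names are the
ones proposed above, but their bodies are stubs I made up: a real
kthread_cpu_can_go() would sleep until the CPU returns, whereas this one
just records the park and pretends the CPU came straight back. The
pending/flushed counters and worker_iteration() are likewise
hypothetical.

```c
#include <stdbool.h>

/* Simulated hotplug state; in reality this would live in the hotplug core. */
static bool cpu_going;		/* set when this thread's CPU is going away */
static int parked_count;	/* how many times the thread has parked */

/* Stub for the proposed helper: is our CPU on its way out? */
static bool kthread_cpu_going(void)
{
	return cpu_going;
}

/*
 * Stub for the proposed helper. A real version would block here until
 * the CPU comes back; the mock returns immediately with the CPU "online".
 */
static void kthread_cpu_can_go(void)
{
	parked_count++;
	cpu_going = false;
}

static int pending;	/* work queued on this CPU, not yet flushed */
static int flushed;	/* work pushed out before the CPU went away */

/* One pass through the kthread's main loop. */
static void worker_iteration(void)
{
	if (kthread_cpu_going()) {
		/* Do any cleanup we need: here, flush what is pending. */
		flushed += pending;
		pending = 0;

		/* This returns when CPU comes back. */
		kthread_cpu_can_go();
		return;
	}

	/* Normal case: accumulate some per-cpu work. */
	pending++;
}
```

The thread keeps its own state across the offline period, so there is
no teardown and recreation, and no separate notifier to get wrong.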

Cheers,
Rusty.