Re: [PATCH 0/4] CPU hotplug, cpusets: Fix CPU online handling related to cpusets

From: Paul E. McKenney
Date: Sat Feb 11 2012 - 11:00:56 EST


On Fri, Feb 10, 2012 at 06:34:04PM +0100, Peter Zijlstra wrote:
> On Fri, 2012-02-10 at 08:53 -0800, Paul E. McKenney wrote:
> > On Fri, Feb 10, 2012 at 04:52:07PM +0100, Peter Zijlstra wrote:
> > > On Thu, 2012-02-09 at 16:11 +0100, Ingo Molnar wrote:
> > >
> > > > > My understanding of the code is that when a CPU is taken
> > > > > offline, it is removed from all the cpusets and then the
> > > > > scan_for_empty_cpusets() function is run to move tasks from
> > > > > empty cpusets to their parent cpusets.
> > > >
> > > > Why is that done that way? offlining a CPU should be an
> > > > invariant as far as cpusets are concerned.
> > >
> > > Can't, tasks need to run someplace. There's two choices, add a still
> > > online cpu to the now empty cpuset or move the tasks to a parent that
> > > still has online cpus.
> > >
> > > Both are destructive.
> >
> > OK, I will ask the stupid question... Hey, somebody has to! ;-)
> >
> > Would it make sense for offlining the last CPU in a cpuset to be
> > destructive, but to allow offlining of a non-last CPU to be reversible?
>
> No, that's very inconsistent and will lead to way more 'surprises'.

It might well lead to surprises, but so does INT_MIN==-INT_MIN. IOW,
the inconsistency certainly is a disadvantage, but it must be weighed
against the disadvantages of the current situation.
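
For anyone who has not run into the INT_MIN case: with two's-complement
wraparound, INT_MIN is its own negation. Below is a minimal sketch, not taken
from any kernel code. The helper names wrapping_negate() and
negation_is_fixed_point() are made up for illustration. The negation is done
in unsigned arithmetic because evaluating -INT_MIN directly on a signed int is
undefined behavior in C, and the conversion back to int is assumed to wrap, as
it does on mainstream compilers.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: negate x with two's-complement wraparound.
 * Unsigned subtraction wraps by definition, so this avoids the undefined
 * behavior of writing -x when x == INT_MIN.
 */
static int wrapping_negate(int x)
{
	unsigned int u = 0u - (unsigned int)x;

	/*
	 * Assumption: converting an out-of-range unsigned value back to int
	 * wraps. This is implementation-defined before C23, but it is what
	 * mainstream compilers do.
	 */
	return (int)u;
}

/* Hypothetical helper: true when negating x gives x back. */
static bool negation_is_fixed_point(int x)
{
	return wrapping_negate(x) == x;
}
```

Under those assumptions, negation_is_fixed_point() holds for exactly two
inputs, 0 and INT_MIN, which is the surprise in question.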

> > /me ducks. ;-)
>
> /me quacks ;-)
>
> Now the whole problem here seems to be that suspend uses cpu-hotplug to
> reduce the machine to UP -- I've no clue why it does that but I can
> imagine its because the BIOS calls only work on CPU0 and/or the resume
> only wakes CPU0 so you have to bootstrap the SMP thing again..
>
> Some suspend person wanna clarify? Rafael?
>
> Anyway, the whole suspend case is magic anyway since all tasks will have
> been frozen, so we could simply leave all of cpuset alone and ignore the
> hotplug notifier on CPU_TASKS_FROZEN callbacks, hmm?
>
> Do we unfreeze after we bring up the machine again?

Agreed, the suspend case is the highest priority in that losing your cpusets
after suspending and resuming is -very- surprising. ;-)
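
A rough userspace sketch of the "ignore the hotplug notifier on
CPU_TASKS_FROZEN callbacks" idea quoted above might look like the following.
It is not a patch and not the actual cpuset code. The #define values are
stand-ins assumed to resemble the kernel's hotplug and notifier constants, and
cpuset_hotplug_callback(), rebuild_cpusets() and cpuset_rebuild_count are
hypothetical names used only to make the control flow visible.

```c
/*
 * Stand-ins so the sketch compiles outside the kernel. The numeric values
 * are assumptions for illustration, not copied from a verified tree.
 */
#define CPU_ONLINE		0x0002
#define CPU_DEAD		0x0007
#define CPU_TASKS_FROZEN	0x0010	/* set when hotplug runs with tasks frozen */
#define NOTIFY_DONE		0x0000
#define NOTIFY_OK		0x0001

/* Hypothetical counter standing in for the real cpuset update work. */
static int cpuset_rebuild_count;

static void rebuild_cpusets(void)
{
	cpuset_rebuild_count++;
}

/* Hypothetical callback: skip cpuset updates for suspend/resume hotplug. */
static int cpuset_hotplug_callback(unsigned long action)
{
	/* Tasks are frozen, so nothing can run; leave the cpusets alone. */
	if (action & CPU_TASKS_FROZEN)
		return NOTIFY_DONE;

	switch (action) {
	case CPU_ONLINE:
	case CPU_DEAD:
		rebuild_cpusets();
		return NOTIFY_OK;
	default:
		return NOTIFY_DONE;
	}
}
```

With this shape, an ordinary CPU_DEAD event still triggers the cpuset update,
while CPU_DEAD | CPU_TASKS_FROZEN (the suspend path) leaves the cpusets
untouched. Whether that is safe depends on the answer to the question above
about when tasks are unfrozen relative to bringing the CPUs back up.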

Thanx, Paul
