Re: Fw: Re: [mm PATCH 4/6] RCU: (now) CPU hotplug

From: Paul E. McKenney
Date: Sun Jan 28 2007 - 18:14:13 EST


On Fri, Jan 26, 2007 at 01:29:49PM -0800, Andrew Morton wrote:
> On Sat, 27 Jan 2007 02:14:06 +0530
> Dipankar Sarma <dipankar@xxxxxxxxxx> wrote:
>
> > On Fri, Jan 26, 2007 at 12:17:39PM -0800, Andrew Morton wrote:
> > > On Sat, 27 Jan 2007 01:16:22 +0530
> > > Dipankar Sarma <dipankar@xxxxxxxxxx> wrote:
> > > > > The plan is, I hope, to rip it all out and do freeze_processes() on the
> > > > > hotplug side, so nobody else needs to worry about cpu hotplug any more.
> > > > > But at present everyone seems to be in hiding.
> > > >
> > > > This would be ideal. However, we don't seem to have any momentum
> > > > on this.
> > >
> > > There's no point in expending effort on a fancy new lock until this option
> > > has been eliminated, so yeah, things are stuck.
> >
> > The new lock (scalable refcount) is almost already there.
> > This http://lkml.org/lkml/2006/10/26/65 can be used to implement
> > get_cpu_hotplug()/put_cpu_hotplug(). The unfairness issue
> > can be fixed. I am going to play with these patches and
> > see if I can come up with something useful quickly.
>
> You're forgetting the large, unknown number of places in the kernel which
> are presently buggy in the presence of CPU hotplug. With your proposal, we
> still need to hunt them all down and put magic locks around them, and we need to
> continue to do that as the kernel evolves.

Certainly the current practice of notifying CPU_UP_PREPARE and CPU_ONLINE
in the same order as CPU_DOWN_PREPARE and CPU_DEAD is just asking for lots
of bugs. The "up" notifications should proceed in the same order as the
services were initialized at boot time, and the "down" notifications need
to proceed in the opposite order.

As things stand, if service B relies on service A, either service B must
go without during the CPU-removal process or services A and B must know
way too much about one another. Either approach is an excellent way
to breed endless quantities of bugs. If "down" notifications instead
proceeded in the opposite order from boot-time initialization, such
dependencies would be resolved automatically, and life is -much- simpler.

So why are we doing hard things the hard way here??? ;-)

> If we use the process freezer, these bugs all get automatically fixed,
> and we get to remove the existing locking, and we don't need to think
> about it any more.

The idea being to essentially suspend the system to RAM, remove the
CPU and then unsuspend it? Seems like quite high overhead -- or am
I misunderstanding the proposal?

Thanx, Paul
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/