Re: CPU Hotplug rework
From: Rusty Russell
Date: Mon Mar 26 2012 - 00:25:15 EST
On Fri, 23 Mar 2012 17:23:47 -0700, "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> On Sat, Mar 24, 2012 at 09:57:32AM +1030, Rusty Russell wrote:
> > On Thu, 22 Mar 2012 15:49:20 -0700, "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx> wrote:
> > > On Thu, Mar 22, 2012 at 02:55:04PM +1030, Rusty Russell wrote:
> > > > On Wed, 21 Mar 2012 10:01:59 +0100, Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:
> > > > > Thing is, if its really too much for some people, they can orchestrate
> > > > > it such that its not. Just move everybody in a cpuset, clear the to be
> > > > > offlined cpu from the cpuset's mask -- this will migrate everybody away.
> > > > > Then hotplug will find an empty runqueue and its fast, no?
> > > >
> > > > I like this solution better.
> > >
> > > As long as we have some way to handle kthreads that are algorithmically
> > > tied to a given CPU. There are coding conventions to handle this, for
> > > example, do everything with preemption disabled and just after each
> > > preempt_disable() verify that you are in fact running on the correct
> > > CPU, but it is easy to imagine improvements.
> >
> > I don't think we should move per-cpu kthreads at all. Let's stop trying
> > to save a few bytes of memory, and just leave them frozen. They'll run
> > again if/when the CPU returns.
>
> OK, that would work for me. So, how do I go about freezing RCU's
> per-CPU kthreads?

Good question.

Obviously, having callbacks hanging around until the CPU comes back is
not viable, nor is blocking preempt during the callbacks. Calling
get_online_cpus() is too heavy.

I can think of three approaches:

1) Put the being-processed rcu calls into a per-cpu var, and pull them
   off that list with preempt disabled. This lets us clean up after the
   thread gets frozen as its CPU goes offline, but doesn't solve the case
   of going offline during a callback.

2) Sync with the thread somehow during a notifier callback. This is the
   same kind of logic as shutting the thread down, so it's not really
   attractive from a simplicity POV.

3) Create a per-cpu rwsem to stop a specific CPU from going down, and
   just grab that while we're processing rcu callbacks.

If this pattern of kthread is common, then #3 (or some equiv lightweight
way of stopping a specific CPU from going offline) is looking
attractive.
Cheers,
Rusty.
--
How could I marry someone with more hair than me? http://baldalex.org