Re: [PATCH 00/51] CPU hotplug: Fix issues with callback registration

From: Gautham R Shenoy
Date: Thu Feb 06 2014 - 07:14:34 EST


On Thu, Feb 06, 2014 at 04:34:33PM +0530, Srivatsa S. Bhat wrote:
> >
> > CPU_POST_DEAD notification, is invoked with the cpu_hotplug.lock
> > dropped. This was necessary for subsystems which would be waiting for
> > some other thread to finish some work, and that other thread could
> > invoke get_online_cpus(). If CPU_POST_DEAD notification were issued
> > without dropping the cpu_hotplug.lock, this would lead to a deadlock
> > as the notifier would be left stuck waiting for the thread which is
> > blocked in get_online_cpus().
> >
> > It was introduced to ensure that multithreaded workqueues can safely
> > use get_online_cpus() [https://lkml.org/lkml/2008/6/29/121].
> >
> > As of now, only two subsystems use this notification and workqueues is
> > _not_ one of them!
> > * arch/x86/kernel/cpu/mcheck/mce.c:mce_cpu_callback()
> > * drivers/cpufreq/cpufreq.c:cpufreq_cpu_callback()
> > I haven't yet audited these two cases to see if they really need this
> > to be handled in CPU_POST_DEAD or if they can be handled in CPU_DEAD.
> >
>
> Well, cpufreq had a legitimate need to use POST_DEAD to avoid the
> deadlock described in commit 1aee40ac9c. However, there had been some
> discussion some time ago about reorganizing the cpufreq's hotplug callback
> so as to move most (but not all) of its work outside of POST_DEAD [1].
> But as it stands, I don't think it would be easy to totally get rid of
> cpufreq's dependence on the POST_DEAD notifier.
>

Right, I see the reason why cpufreq needs POST_DEAD.

> Besides, I think its good to retain the POST_DEAD notifier option in
> the CPU hotplug core code. It has come handy several times to fix hard
> deadlock issues.
>

I know. I am not denying the usefulness of POST_DEAD. But the fact
that some of the CPU_* notifiers are invoked with the cpu_hotplug.lock
held while CPU_POST_DEAD is invoked with the lock dropped looks a bit
asymmetrical. At the moment I cannot think of a simpler alternative.


> > Also can we have an alternate API, something like
> > cpu_hotplug_register_begin/end() instead of reusing
> > cpu_maps_update_begin/end() for this usage, since in most of the
> > patches that follow, we're not touching the any of the cpu_*_maps!
> >
>
> Yes, the function names cpu_maps_update_begin/end() don't really suit
> the kind of usage I'm proposing in this patchset, and hence is kind of
> a misnomer. For better readability, I'm thinking of defining a macro
> such as say, cpu_hotplug_notifier_lock()/unlock() that redirects to
> cpu_maps_update_begin/end() respectively. That way, we can export just
> those former symbols for use by modules, and thereby the code would look
> more intuitive, like this:
>
> cpu_hotplug_notifier_lock();
>
> for_each_online_cpu(cpu)
> init_cpu(cpu);
>
> /* This doesn't take the cpu_add_remove_lock */
> __register_cpu_notifier(&foobar_cpu_notifier);
>
> cpu_hotplug_notifier_unlock();
>
> What do you think?

Sounds good.
>
> Regards,
> Srivatsa S. Bhat
>

Thanks and Regards
gautham.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/