Re: [PATCH 00/51] CPU hotplug: Fix issues with callback registration
From: Srivatsa S. Bhat
Date: Thu Feb 06 2014 - 06:10:53 EST
On 02/06/2014 03:08 PM, Gautham R Shenoy wrote:
> Hi,
>
> On Thu, Feb 06, 2014 at 03:34:36AM +0530, Srivatsa S. Bhat wrote:
>> Hi,
>>
>>
>> To solve these issues and provide a race-free method to register CPU hotplug
>> callbacks, this patchset introduces new variants of the callback registration
>> APIs that don't hold the cpu_add_remove_lock, and exports the
>> cpu_add_remove_lock via cpu_maps_update_begin/done() for use by various
>> subsystems. With this in place, the following code snippet will register a
>> hotplug callback as well as initialize already online CPUs without any race
>> conditions.
>>
>> cpu_maps_update_begin();
>>
>> for_each_online_cpu(cpu)
>>         init_cpu(cpu);
>>
>> /* This doesn't take the cpu_add_remove_lock */
>> __register_cpu_notifier(&foobar_cpu_notifier);
>>
>> cpu_maps_update_done();
>>
>
> Couple of comments:
>
> Right now, cpu_add_remove_lock is being used to
> 1) Serialize the cpu-hotplug writers.
>
> 2) Serialize accesses to cpu_present/possible_map.
>
> 3) Serialize updates to the cpu_chain (the cpu hotplug notifier chain).
>
> - This is necessary to ensure that registration of notifiers and
> invocation of CPU_POST_DEAD notifications don't race with each
> other. Else we could have used get/put_online_cpus() in
> register_cpu_notifier() and this patch series wouldn't have been
> necessary.
>
> 4) Bulk cpu-hotplug (disable/enable_non_boot_cpus), but this is a
> special case of 1).
>
> The CPU_POST_DEAD notification is invoked with the cpu_hotplug.lock
> dropped. This was necessary for subsystems whose callbacks wait for
> some other thread to finish some work, where that other thread could
> itself invoke get_online_cpus(). If the CPU_POST_DEAD notification
> were issued without dropping the cpu_hotplug.lock, this would lead to
> a deadlock: the notifier would be left waiting for a thread that is
> blocked in get_online_cpus().
>
> It was introduced to ensure that multithreaded workqueues can safely
> use get_online_cpus() [https://lkml.org/lkml/2008/6/29/121].
>
> As of now, only two subsystems use this notification, and workqueues
> is _not_ one of them!
> * arch/x86/kernel/cpu/mcheck/mce.c:mce_cpu_callback()
> * drivers/cpufreq/cpufreq.c:cpufreq_cpu_callback()
> I haven't yet audited these two cases to see whether they really need
> to be handled in CPU_POST_DEAD or whether they can be handled in
> CPU_DEAD.
>
Well, cpufreq had a legitimate need to use POST_DEAD to avoid the
deadlock described in commit 1aee40ac9c. There was some discussion a
while ago about reorganizing cpufreq's hotplug callback so as to move
most (but not all) of its work out of POST_DEAD [1]. But as it stands,
I don't think it would be easy to get rid of cpufreq's dependence on
the POST_DEAD notifier entirely.

Besides, I think it's good to retain the POST_DEAD notifier option in
the CPU hotplug core code. It has come in handy several times to fix
hard deadlock issues.
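
To make that concrete, here is a minimal sketch of the deadlock
pattern that POST_DEAD sidesteps (the foobar_* names and the deferred
work item are hypothetical, purely for illustration):

#include <linux/cpu.h>
#include <linux/notifier.h>
#include <linux/workqueue.h>

static struct work_struct foobar_work;	/* hypothetical deferred work */

static int foobar_cpu_callback(struct notifier_block *nfb,
			       unsigned long action, void *hcpu)
{
	switch (action & ~CPU_TASKS_FROZEN) {
	case CPU_DEAD:
		/*
		 * DEADLOCK: CPU_DEAD is delivered while the hotplug
		 * writer still holds cpu_hotplug.lock. If the worker
		 * we wait for here calls get_online_cpus(), it blocks
		 * on that same lock and neither side makes progress.
		 *
		 * flush_work(&foobar_work);
		 */
		break;
	case CPU_POST_DEAD:
		/*
		 * cpu_hotplug.lock has already been dropped here, so
		 * waiting for the worker is safe even if it calls
		 * get_online_cpus().
		 */
		flush_work(&foobar_work);
		break;
	}
	return NOTIFY_OK;
}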
> Also, can we have an alternate API, something like
> cpu_hotplug_register_begin/end(), instead of reusing
> cpu_maps_update_begin/end() for this usage? In most of the patches
> that follow, we're not touching any of the cpu_*_maps!
>
Yes, the function names cpu_maps_update_begin/end() don't really suit
the kind of usage I'm proposing in this patchset, and hence are
something of a misnomer. For better readability, I'm thinking of
defining macros, say cpu_hotplug_notifier_lock()/unlock(), that
redirect to cpu_maps_update_begin/end() respectively. That way, we can
export just those new symbols for use by modules, and the code would
look more intuitive, like this:
cpu_hotplug_notifier_lock();
for_each_online_cpu(cpu)
        init_cpu(cpu);
/* This doesn't take the cpu_add_remove_lock */
__register_cpu_notifier(&foobar_cpu_notifier);
cpu_hotplug_notifier_unlock();
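
The wrappers themselves could be as thin as the following (just a
sketch of the idea, not actual mainline code):

#define cpu_hotplug_notifier_lock()	cpu_maps_update_begin()
#define cpu_hotplug_notifier_unlock()	cpu_maps_update_done()

Being plain redirects to the existing functions, they add no new
locking semantics.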
What do you think?
Regards,
Srivatsa S. Bhat