deadlock between cpu_stopper & native_flush_tlb_others()->smp_call_function_many()

From: Igor Mammedov
Date: Mon Mar 03 2014 - 11:44:34 EST


It looks like I hit a deadlock between smp_call_function_many() and
cpu_stopper threads.

smp_call_function_many() on CPU1, called from
native_flush_tlb_others(), waits for the call to complete on
CPU2. Meanwhile CPU2 waits on state synchronization in
multi_cpu_stop(), which can't complete until the stop work
queued on CPU1 runs, and that can't happen because CPU1 is
busy looping in smp_call_function_many().
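
As a rough illustration only (userspace C, not kernel code), the
circular wait described above can be modelled as a wait-for graph in
which each CPU records the CPU it is blocked on. The names NO_WAIT,
in_wait_cycle() and reported_case_deadlocks() are made up for this
sketch and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_WAIT (-1)	/* hypothetical marker: this CPU is not blocked */

/*
 * waits_on[i] is the CPU that CPU i is blocked on, or NO_WAIT.
 * Returns true if following the chain from 'start' leads back to 'start'.
 */
static bool in_wait_cycle(const int waits_on[], int nr_cpus, int start)
{
	assert(start >= 0 && start < nr_cpus);

	int cur = start;

	/* A cycle through 'start' is at most nr_cpus hops long. */
	for (int hops = 0; hops < nr_cpus; hops++) {
		cur = waits_on[cur];
		if (cur == NO_WAIT)
			return false;	/* chain ends at a runnable CPU */
		assert(cur >= 0 && cur < nr_cpus);
		if (cur == start)
			return true;	/* came back around: circular wait */
	}
	return false;
}

/* The reported case: CPU1 waits on CPU2 (csd lock), CPU2 on CPU1 (ack). */
static bool reported_case_deadlocks(void)
{
	/*                       CPU0     CPU1  CPU2 */
	const int waits_on[] = { NO_WAIT, 2,    1 };

	return in_wait_cycle(waits_on, 3, 1);
}
```

In this model reported_case_deadlocks() returns true, since CPU1's
chain goes to CPU2 and straight back to CPU1. Removing either edge
makes the chain end at a runnable CPU.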


CPU1                                CPU2
                  stop_machine()
                  queue stop work on cpu 1&2

native_flush_tlb_others()
  smp_call_function_many()
  ...
---------------------------------------------------------
                                    cpu_stopper_thread()
                                      multi_cpu_stop()
                                        do {
                                          ...
                                          msdata->state == MULTI_STOP_PREPARE
                                          msdata->active_cpus == 0110
                                          msdata->thread_ack == 1
                                        } while (curstate != MULTI_STOP_EXIT)
                                        waiting until CPU1 ACKs state,
                                        i.e. thread_ack == 0
---------------------------------------------------------
  ...
  if (wait) {
    for_cpu(0110) {
      csd_lock_wait(csd);
      waiting until call on CPU2 is completed
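
To make the thread_ack value in the trace concrete, here is a minimal
userspace model of the ack accounting, written under the assumption
that each participating stopper thread decrements a shared counter
once per state and that the last one to ack advances the state. The
struct, enum and function names (model_msdata, MODEL_*, model_*) are
stand-ins invented for this sketch, not the kernel's definitions.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the multi-stop state sequence. */
enum model_state {
	MODEL_NONE,
	MODEL_PREPARE,
	MODEL_DISABLE_IRQ,
	MODEL_RUN,
	MODEL_EXIT,
};

/* Hypothetical stand-in for the msdata fields shown in the trace. */
struct model_msdata {
	enum model_state state;
	int num_threads;
	atomic_int thread_ack;
};

static void model_set_state(struct model_msdata *m, enum model_state s)
{
	/* Re-arm the counter: every thread must ack the new state. */
	atomic_store(&m->thread_ack, m->num_threads);
	m->state = s;
}

/* One thread acks the current state; the last acker advances it. */
static void model_ack_state(struct model_msdata *m)
{
	int prev = atomic_fetch_sub(&m->thread_ack, 1);

	assert(prev > 0);
	if (prev == 1)
		model_set_state(m, (enum model_state)(m->state + 1));
}

/* State reached when 'acks' of two stopper threads ack PREPARE. */
static enum model_state model_state_after_acks(int acks)
{
	struct model_msdata m = { .num_threads = 2 };

	model_set_state(&m, MODEL_PREPARE);
	for (int i = 0; i < acks; i++)
		model_ack_state(&m);
	return m.state;
}
```

With a single ack (CPU2 only) the model stays in MODEL_PREPARE with
thread_ack == 1, which matches the values in the trace. It only moves
on once the second thread acks as well.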

Are there any suggestions on how to fix this nicely?