Re: [PATCH V2] smp: Give WARN()ing when calling smp_call_function_many()/single() in serving irq

From: Lai Jiangshan
Date: Wed Feb 27 2013 - 09:50:51 EST


On Sat, Feb 16, 2013 at 10:10 PM, Chuansheng Liu
<chuansheng.liu@xxxxxxxxx> wrote:
> Currently the functions smp_call_function_many()/single() will
> give a WARN()ing only in the case of irqs_disabled(), but that
> check is not enough to guarantee execution of the SMP
> cross-calls.
>
> In many other cases, such as softirq or interrupt handling, the
> two APIs still must not be called, just as the
> smp_call_function_many() comment says:
>
> * You must not call this function with disabled interrupts or from a
> * hardware interrupt handler or from a bottom half handler. Preemption
> * must be disabled when calling this function.
>
> There is a real softirq DEADLOCK case:
>
> CPUA                                  CPUB
>                                       spin_lock(&spinlock)
>                                       <an irq arrives; the irq handler runs>
>                                       irq_exit()
> spin_lock_irq(&spinlock)
> <== blocking here, with irqs
>     disabled, because CPUB
>     holds the lock
>                                       __do_softirq()
>                                       run_timer_softirq()
>                                       timer_cb()
>                                       call smp_call_function_many()
>                                       send IPI interrupt to CPUA
>                                       wait_csd()
>
> Then both CPUA and CPUB will be deadlocked here.
>
> So we should give a warning in nmi, hardirq, or softirq context as well.
>
> Moreover, add a new macro, in_serving_irq(), which indicates that we
> are processing an nmi, hardirq, or softirq.
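
(For illustration only: a minimal sketch of the pattern the scenario
above describes. The function names here are hypothetical and not part
of the patch.)

#include <linux/smp.h>
#include <linux/timer.h>

static void remote_fn(void *info)
{
	/* executed on the other CPUs via IPI */
}

/* Timer callback: runs in softirq context via run_timer_softirq(). */
static void example_timer_cb(unsigned long data)
{
	/*
	 * wait == 1: spin until every target CPU has run remote_fn().
	 * If one of those CPUs (CPUA above) is itself spinning on a
	 * lock with interrupts disabled, it can never take the IPI,
	 * and both CPUs deadlock.
	 */
	smp_call_function(remote_fn, NULL, 1);
}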

The code smells bad: in_serving_softirq() does not take spin_lock_bh() into account.

CPUA                        CPUB                        CPUC
                            spin_lock(&lockA)
                            <an irq arrives; the
                            irq handler runs>
                            irq_exit()
spin_lock_irq(&lockA)
*blocking* here, because
CPUB holds lockA (and irqs
are now disabled)
                                                        spin_lock_bh(&lockB)
                            __do_softirq()
                            run_timer_softirq()
                            spin_lock_bh(&lockB)
                            *blocking* here, because
                            CPUC holds lockB
                                                        call smp_call_function_many()
                                                        send IPI interrupt to CPUA
                                                        wait_csd()
                                                        *blocking* here

So it is still a deadlock, but your code does not warn about it.
Hence in_softirq() is a better choice than in_serving_softirq() inside
in_serving_irq(), and the resulting in_serving_irq() is then the same
as in_interrupt().
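
The key point is how preempt_count() is updated: spin_lock_bh() (via
local_bh_disable()) raises the softirq part of preempt_count() without
setting the "serving softirq" bit, so in_softirq() and in_interrupt()
see a BH-disabled section while in_serving_softirq() does not. For
reference, roughly what include/linux/hardirq.h already provides in
this area (an excerpt from memory, not a verbatim copy of the file):

#define hardirq_count()	(preempt_count() & HARDIRQ_MASK)
#define softirq_count()	(preempt_count() & SOFTIRQ_MASK)
#define irq_count()	(preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK \
				 | NMI_MASK))

#define in_irq()		(hardirq_count())
#define in_softirq()		(softirq_count())
#define in_interrupt()		(irq_count())
/* true only while a softirq handler is actually running: */
#define in_serving_softirq()	(softirq_count() & SOFTIRQ_OFFSET)

/*
 * local_bh_disable() adds SOFTIRQ_DISABLE_OFFSET (2 * SOFTIRQ_OFFSET),
 * so inside a spin_lock_bh() section in_softirq()/in_interrupt() are
 * true while in_serving_softirq() stays false.
 */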

So please remove in_serving_irq() and use in_interrupt() instead.
And add:

Reviewed-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
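
For clarity, with that substitution the check in both functions would
read roughly as follows (a sketch of the suggested change, not the
final patch; smp_call_function_many() additionally keeps its
!early_boot_irqs_disabled test):

	/* warn when called with irqs disabled or from any irq/BH context */
	WARN_ON_ONCE(cpu_online(this_cpu)
		     && (irqs_disabled() || in_interrupt())
		     && !oops_in_progress);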

In the long term, the best solution is to use percpu lockdep for
local_irq_disable() and smp_call_function_many():

CPUA                              CPUB
                                  spin_lock(&lockA)
spin_lock_irq(&lockA)
*blocking* here, because
CPUB holds lockA (and irqs
are now disabled)
                                  call smp_call_function_many()
                                  send IPI interrupt to CPUA
                                  wait_csd()
                                  *blocking* here

I will do it the week after next.

Thanks,
Lai


>
> Signed-off-by: liu chuansheng <chuansheng.liu@xxxxxxxxx>
> ---
>  include/linux/hardirq.h |    5 +++++
>  kernel/smp.c            |   11 +++++++----
>  2 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> index 624ef3f..e07663f 100644
> --- a/include/linux/hardirq.h
> +++ b/include/linux/hardirq.h
> @@ -94,6 +94,11 @@
>   */
>  #define in_nmi()	(preempt_count() & NMI_MASK)
>
> +/*
> + * Are we in nmi, irq context, or softirq context?
> + */
> +#define in_serving_irq() (in_nmi() || in_irq() || in_serving_softirq())
> +
>  #if defined(CONFIG_PREEMPT_COUNT)
>  # define PREEMPT_CHECK_OFFSET 1
>  #else
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 69f38bd..b0a5d21 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -12,6 +12,7 @@
>  #include <linux/gfp.h>
>  #include <linux/smp.h>
>  #include <linux/cpu.h>
> +#include <linux/hardirq.h>
>
>  #include "smpboot.h"
>
> @@ -323,8 +324,9 @@ int smp_call_function_single(int cpu, smp_call_func_t func, void *info,
>  	 * send smp call function interrupt to this cpu and as such deadlocks
>  	 * can't happen.
>  	 */
> -	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
> -		     && !oops_in_progress);
> +	WARN_ON_ONCE(cpu_online(this_cpu)
> +		     && (irqs_disabled() || in_serving_irq())
> +		     && !oops_in_progress);
>
>  	if (cpu == this_cpu) {
>  		local_irq_save(flags);
> @@ -462,8 +464,9 @@ void smp_call_function_many(const struct cpumask *mask,
>  	 * send smp call function interrupt to this cpu and as such deadlocks
>  	 * can't happen.
>  	 */
> -	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
> -		     && !oops_in_progress && !early_boot_irqs_disabled);
> +	WARN_ON_ONCE(cpu_online(this_cpu)
> +		     && (irqs_disabled() || in_serving_irq())
> +		     && !oops_in_progress && !early_boot_irqs_disabled);
>
>  	/* Try to fastpath. So, what's a CPU they want? Ignoring this one. */
>  	cpu = cpumask_first_and(mask, cpu_online_mask);
> --
> 1.7.0.4