Re: [PATCH 6/7] smp: Don't yell about IRQs disabled in kgdb_roundup_cpus()

From: Peter Zijlstra
Date: Tue Oct 30 2018 - 11:33:57 EST


On Mon, Oct 29, 2018 at 11:07:06AM -0700, Douglas Anderson wrote:
> In kgdb_roundup_cpus() we've got code that looks like:
> local_irq_enable();
> smp_call_function(kgdb_call_nmi_hook, NULL, 0);
> local_irq_disable();

> Let's add kgdb to the list of reasons not to warn in
> smp_call_function_many(). That will allow us (in a future patch) to
> stop calling local_irq_enable() which will get rid of the original
> splat.
>
> NOTE: with this change comes the obvious question: will we start
> deadlocking more often now when we drop into the debugger. I can't
> say that for sure one way or the other, but the fact that we do the
> same logic for "oops_in_progress" makes me feel like it shouldn't
> happen too often. Also note that the old logic of turning on
> interrupts temporarily wasn't exactly safe since (I presume) that
> could have violated spin_lock_irqsave() semantics and ended up with a
> deadlock of its own.

How is any of that not utterly and terminally broken?

> @@ -413,7 +414,8 @@ void smp_call_function_many(const struct cpumask *mask,
> * can't happen.
> */
> WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
> - && !oops_in_progress && !early_boot_irqs_disabled);
> + && !oops_in_progress && !early_boot_irqs_disabled
> + && !in_dbg_master());
>
> /* Try to fastpath. So, what's a CPU they want? Ignoring this one. */
> cpu = cpumask_first_and(mask, cpu_online_mask);

Not a fan of this. There is a distinct difference between
oops_in_progress and dropping into kgdb in that you don't ever expect to
return/survive oopses, whereas we do expect to survive kgdb.

Also, how does kgdb work at all without actual NMIs ?

So no, NAK on this.