RE: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform

From: Song Bao Hua (Barry Song)
Date: Sat Feb 13 2021 - 17:26:14 EST

> -----Original Message-----
> From: Arnd Bergmann [mailto:arnd@xxxxxxxxxx]
> Sent: Sunday, February 14, 2021 5:32 AM
> To: Song Bao Hua (Barry Song) <song.bao.hua@xxxxxxxxxxxxx>
> Cc: tglx@xxxxxxxxxxxxx; gregkh@xxxxxxxxxxxxxxxxxxx; arnd@xxxxxxxx;
> geert@xxxxxxxxxxxxxx; funaho@xxxxxxxxx; philb@xxxxxxx; corbet@xxxxxxx;
> mingo@xxxxxxxxxx; linux-m68k@xxxxxxxxxxxxxxxxxxxx;
> fthain@xxxxxxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI)
> enabled on some platform
>
> On Sat, Feb 13, 2021 at 12:50 AM Song Bao Hua (Barry Song)
> <song.bao.hua@xxxxxxxxxxxxx> wrote:
>
> > So I was actually trying to warn this unusual case - interrupts
> > get nested while both in_hardirq() and irqs_disabled() are true.
> >
> > diff --git a/include/linux/hardirq.h b/include/linux/hardirq.h
> > index 7c9d6a2d7e90..b8ca27555c76 100644
> > --- a/include/linux/hardirq.h
> > +++ b/include/linux/hardirq.h
> > @@ -32,6 +32,7 @@ static __always_inline void rcu_irq_enter_check_tick(void)
> > */
> > #define __irq_enter() \
> > do { \
> > + WARN_ONCE(in_hardirq() && irqs_disabled(), "nested interrupts\n"); \
> > preempt_count_add(HARDIRQ_OFFSET); \
>
> That seems to be a rather heavyweight change in a critical path.
>
> A more useful change might be to implement lockdep support for m68k
> and see if that warns about any actual problems. I'm not sure
> what is actually missing for that, but these are the commits that
> added it for other architectures in the past:
>
> 3c4697982982 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT")
> 000591f1ca33 ("csky: Enable LOCKDEP_SUPPORT")
> 78cdfb5cf15e ("openrisc: enable LOCKDEP_SUPPORT and irqflags tracing")
> 8f371c752154 ("xtensa: enable lockdep support")
> bf2d80966890 ("microblaze: Lockdep support")
>

Yes. m68k lacks lockdep support, which could be added.

> > And I also think it is better for m68k's arch_irqs_disabled() to
> > return true only when both low and high priority interrupts are
> > disabled rather than try to mute this warn in genirq by a weaker
> > condition:
> > if (WARN_ONCE(!irqs_disabled(), "irq %u handler %pS enabled interrupts\n",
> > irq, action->handler))
> > local_irq_disable();
> > }
> >
> > This warn is not activated on m68k because its arch_irqs_disabled() return
> > true though its high-priority interrupts are still enabled.
>
> Then it would just end up always warning when a nested hardirq happens,
> right? That seems no different to dropping support for nested hardirqs
> on m68k altogether, which of course is what you suggested already.

This won't end up as a warning on other architectures such as arm, arm64
and x86, because no interrupt can arrive while arch_irqs_disabled() is true
in hardirq context. On ARM, for example, the I bit of the CPSR is set:
static inline int arch_irqs_disabled_flags(unsigned long flags)
{
	return flags & IRQMASK_I_BIT;
}

So it would only give a backtrace on platforms whose arch_irqs_disabled()
returns true while only some interrupts are disabled and others are still
enabled, so that nested interrupts can arrive without any explicit code to
enable interrupts.
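
To make the difference concrete, here is a small user-space model (not
kernel code) of the two kinds of predicate. All function and macro names
are made up for illustration. The bit positions are the architectural
ones (the I bit is bit 7 of the ARM CPSR, the priority mask is bits 8-10
of the m68k SR), and the "strict" variant is only my guess at what a
stronger m68k check could look like:

```c
#include <stdbool.h>

/* ARM: bit 7 of the CPSR (the I bit) masks all normal IRQs at once. */
#define ARM_PSR_I_BIT		0x00000080UL

/* m68k: bits 8-10 of SR hold the interrupt priority mask (IPL 0..7). */
#define M68K_SR_IPL_MASK	0x0700UL
#define M68K_SR_IPL_SHIFT	8

/* ARM-style check: one bit, so "disabled" means no IRQ can arrive. */
static bool arm_irqs_disabled_flags(unsigned long cpsr)
{
	return (cpsr & ARM_PSR_I_BIT) != 0;
}

/* Weak check (hypothetical name): any non-zero priority mask counts. */
static bool m68k_irqs_disabled_weak(unsigned long sr)
{
	return (sr & M68K_SR_IPL_MASK) != 0;
}

/* Strict check (assumption): only a fully raised mask (IPL 7) counts. */
static bool m68k_irqs_disabled_strict(unsigned long sr)
{
	return (sr & M68K_SR_IPL_MASK) == M68K_SR_IPL_MASK;
}

/*
 * Can an interrupt of the given level still be delivered?
 * Only the maskable levels 1..6 are modelled; level 7 is left out.
 */
static bool m68k_irq_can_fire(unsigned long sr, unsigned int level)
{
	unsigned int ipl = (sr & M68K_SR_IPL_MASK) >> M68K_SR_IPL_SHIFT;

	return level > ipl;
}
```

With SR = 0x2300 (mask at level 3) the weak check says "disabled", yet a
level 5 interrupt can still be delivered in this model. That is the
situation the warning is meant to catch. With SR = 0x2700 both checks
agree and nothing at levels 1..6 can nest.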

This warning seems to give a consistent interpretation of what "Run irq
handlers with interrupts disabled" means in commit e58aa3d2d0cc ("genirq:
Run irq handlers with interrupts disabled").

>
> Arnd

Thanks
Barry