Re: [RFC] IRQ handlers run with some high-priority interrupts(not NMI) enabled on some platform

From: Arnd Bergmann
Date: Fri Feb 12 2021 - 17:48:37 EST


On Fri, Feb 12, 2021 at 2:18 AM Song Bao Hua (Barry Song)
<song.bao.hua@xxxxxxxxxxxxx> wrote:

> So I am requesting comments on:
> 1. are we expecting all interrupts except NMI to be disabled in irq handler,
> or do we actually allow some high-priority interrupts between low and NMI to
> come in some platforms?

I tried to come to an answer but this does not seem particularly well-defined.
There are a few things I noticed:

- going through the local_irq_save()/restore() implementations on all
architectures, I did not find any other ones besides m68k that leave
high-priority interrupts enabled. I did see that at least alpha and openrisc
are designed to support that in hardware, but the code just leaves the
interrupts disabled.

- The generic code is clearly prepared to handle nested hardirqs, and
the irq_enter()/irq_exit() functions have a counter in preempt_count
for the nesting level, using a 4-bit number for hardirq, plus another
4-bit number for NMI.

- There are a couple of (ancient) drivers that enable interrupts in their
interrupt handlers, see the four callers of local_irq_enable_in_hardirq()
(all in the old drivers/ide stack) and arch/ia64/kernel/time.c, which
enables interupts in its timer function (I recently tried removing this
and my patch broke ia64 timers, but I'm not sure if the cause was
the local_irq_enable() or something else).

- The local_irq_enable_in_hardirq() function itself turns into a nop
when lockdep is enabled, since d7e9629de051 ("[PATCH] lockdep:
add local_irq_enable_in_hardirq() API"). According to the comment
in there, lockdep already enforces the behavior you suggest. Note that
lockdep support is missing on m68k (and also alpha, h8300, ia64, nios2,
and parisc).

> 2. If either side is true, I think we need to document it somewhere as there
> is always confusion about this.
>
> Personally, I would expect all interrupts to be disabled and I like the way
> of ARM64 to only use high-priority interrupt as pseudo NMI:
> https://lwn.net/Articles/755906/
> Though Finn argued that this will contribute to lose hardware feature of m68k.

Regardless of what is documented, I would argue that any platform
that relies on this is at the minimum doing something risky because at
the minimum this runs into hard to debug code paths that are not
exercised on any of the common architectures.

Arnd