Re: MSI irqchip configured as IRQCHIP_ONESHOT_SAFE causes spurious IRQs

From: Ramon Fried
Date: Wed Jan 15 2020 - 19:19:56 EST


On Wed, Jan 15, 2020 at 12:54 AM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>
> Ramon Fried <rfried.dev@xxxxxxxxx> writes:
> > On Tue, Jan 14, 2020 at 11:38 PM Ramon Fried <rfried.dev@xxxxxxxxx> wrote:
> >> On Tue, Jan 14, 2020 at 2:15 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
> >> > Ramon Fried <rfried.dev@xxxxxxxxx> writes:
> >> > > Besides the side effect of that, I don't really understand the logic
> >> > > of not masking the MSI until the threaded handler is complete,
> >> > > especially when there's no HW handler and only threaded handler.
> >> >
> >> > What's wrong with having another interrupt firing while the threaded
> >> > handler is running? Nothing, really. It actually can be desired because
> >> > the threaded handler is allowed to sleep.
> >> >
> >> What do you mean, isn't that the purpose of IRQ masking? Interrupt
> >> coalescing is done to mitigate these IRQ's, these HW interrupts just
> >> consume CPU cycles and don't do anything useful (scheduling an
> >> already scheduled thread).
>
> Again, that depends on your POV. It's a perfectly valid scenario to have
> another HW irq coming in preventing the thread to go to sleep and just
> run for another cycle. So no, masking is not necessarily required and
> the semantics of MSI is edge type, so the hardware should not fire
> another interrupt _before_ the threaded handler actually took care of
> the initial one.
>
> > Additionally, in this case there isn't even an HW IRQ handler, it's
> > passed as NULL in the request IRQ function in this scenario.
>
> This is completely irrelevant. The primary hardware IRQ handler is
> provided by the core code in this case.
>
You're right.

> Due to the semantics of MSI this is perfectly fine and aside of your
> problem this has worked perfectly fine so far and it's an actual
> performance win because it avoids fiddling with the MSI mask which is
> slow.
>
Fiddling with the MSI mask is a configuration space write, which is
non-posted, so it does come with a price.
The question is whether a test was ever conducted to see if it's better
than spurious IRQs.

> You still have not told which driver/hardware is affected by this. Can
> you please provide that information so we can finally look at the actual
> hardware/driver combo?
>
Sure,
I'm writing an MSI IRQ controller; it's basically a MIPS GIC interrupt
line onto which several MSIs are multiplexed.
It's configured with handle_level_irq(), as the GIC IRQ is level-triggered.

The ack callback acks the GIC IRQ.
The mask/unmask callbacks call pci_msi_mask_irq() / pci_msi_unmask_irq().
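
To make the scenario concrete, here is a rough userspace toy model of what
I'm describing. It is not kernel code and nothing in it is a kernel API:
simulate(), struct sim_result, the tick-based timing and the keep_masked
switch are all made up for illustration. It assumes a masked MSI is latched
as a single pending bit and delivered on unmask, and that waking an already
running thread just makes it run one more pass. It counts how often the
primary handler fires with the MSI left unmasked while the thread runs,
versus kept masked until the thread is done.

```c
#include <stdbool.h>

/* Counters produced by one run of the toy model. */
struct sim_result {
    int hard_irqs;    /* times the primary (hard IRQ) handler ran */
    int thread_runs;  /* passes of the threaded handler */
};

/*
 * Hypothetical userspace model, NOT kernel code.
 * msi_ticks:   non-negative, strictly increasing ticks at which the
 *              device sends an MSI
 * thread_len:  ticks one pass of the threaded handler takes (must be > 0)
 * keep_masked: true  = MSI stays masked until the thread has finished
 *              false = MSI is left unmasked while the thread runs
 */
static struct sim_result simulate(const int *msi_ticks, int n,
                                  int thread_len, bool keep_masked)
{
    struct sim_result r = { 0, 0 };
    bool masked = false, pending = false, running = false, rerun = false;
    int busy_until = 0, idx = 0;

    for (int t = 0; idx < n || running || pending; t++) {
        /* 1. A thread pass ends: run again if re-woken, else go idle. */
        if (running && t == busy_until) {
            if (rerun) {
                rerun = false;
                busy_until = t + thread_len;
                r.thread_runs++;
            } else {
                running = false;
                if (keep_masked)
                    masked = false;
            }
        }

        /* 2. The device sends an MSI; latch it as pending. */
        if (idx < n && msi_ticks[idx] == t) {
            pending = true;
            idx++;
        }

        /* 3. Deliver the pending MSI if the vector is unmasked. */
        if (pending && !masked) {
            pending = false;
            r.hard_irqs++;
            if (keep_masked)
                masked = true;
            if (running) {
                rerun = true;   /* thread already active: only re-wakes it */
            } else {
                running = true;
                busy_until = t + thread_len;
                r.thread_runs++;
            }
        }
    }
    return r;
}
```

With MSIs at ticks 0, 1, 2 and 10 and a thread pass of 5 ticks, this model
gives 4 hard IRQs and 3 thread passes when the MSI is left unmasked, and
3 hard IRQs and 3 thread passes when it is kept masked. The thread does the
same amount of work either way; the extra hard IRQ only re-wakes a thread
that is already running. The model says nothing about what a mask/unmask
write costs, which is the other half of the trade-off.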

Thanks,
Ramon.
> Either the driver is broken or the hardware does not comply with the MSI
> spec.
>
> Thanks,
>
> tglx