Re: [PATCH] PCI: mvebu: Use devm_request_irq() for registering interrupt handler
From: Marc Zyngier
Date: Thu Jun 23 2022 - 16:31:44 EST
Hi Bjorn,
On Thu, 23 Jun 2022 17:49:42 +0100,
Bjorn Helgaas <helgaas@xxxxxxxxxx> wrote:
>
> [+cc Marc, IRQ affinity vs chained IRQ handlers]
>
> On Thu, Jun 23, 2022 at 06:32:40PM +0200, Pali Rohár wrote:
> > On Thursday 23 June 2022 11:27:47 Bjorn Helgaas wrote:
> > > On Tue, May 24, 2022 at 02:28:17PM +0200, Pali Rohár wrote:
> > > > Same as in commit a3b69dd0ad62 ("Revert "PCI: aardvark: Rewrite
> > > > IRQ code to chained IRQ handler"") for pci-aardvark driver, use
> > > > devm_request_irq() instead of chained IRQ handler in pci-mvebu.c
> > > > driver.
> > > >
> > > > This change fixes affinity support and allows to pin interrupts
> > > > from different PCIe controllers to different CPU cores.
> > >
> > > Several other drivers use irq_set_chained_handler_and_data(). Do
> > > any of them need similar changes?
> >
> > I do not know. This needs testing on HW which use those other
> > drivers.
> >
> > > The commit log suggests that using chained IRQ handlers breaks
> > > affinity support. But perhaps that's not the case and the real
> > > culprit is some other difference between mvebu and the other
> > > drivers.
> >
> > It is possible. But similar patch (revert; linked below) was
> > required for aardvark. I tested same approach on mvebu and it fixed
> > affinity support.
>
> This feels like something we should understand better. If
> irq_set_chained_handler_and_data() is a problem for affinity, we
> should fix it across the board in all the drivers at once.
>
> If the real problem is something different, we should figure that out
> and document it in the commit log.
>
> I cc'd Marc in case he has time to educate us.
Thanks for roping me in.
The problem of changing affinities for chained (or multiplexing)
interrupts is, to make it short, that it breaks the existing userspace
ABI that a change in affinity affects only the interrupt userspace
acts upon, and no other. Which is why we don't expose any affinity
setting for such an interrupt, as by definition changing its affinity
affects all the interrupts that are muxed onto it.
By turning a chained interrupt into a normal handler, people work
around the above rule and break the contract the kernel has with
userspace.
Do I think this is acceptable? No. Can something be done about this?
Maybe.
Marek asked this exact question last month[1], and I gave a detailed
explanation of what could be done to improve matters, allowing this to
happen as long as userspace is made aware of the effects, which means
creating a new userspace ABI.
I would rather see people work on a proper solution than add bad hacks
that only work in environments where the userspace ABI can be safely
ignored, such as on an closed, embedded device.
HTH,
M.
[1] https://lore.kernel.org/all/20220502102137.764606ee@thinkpad/
--
Without deviation from the norm, progress is not possible.