Re: Affinity setting problem for emulated MSI on PLIC

From: Inochi Amaoto
Date: Thu Jul 24 2025 - 18:26:01 EST


On Thu, Jul 24, 2025 at 01:07:41PM +0200, Thomas Gleixner wrote:
> On Thu, Jul 24 2025 at 09:34, Inochi Amaoto wrote:
> > On Thu, Jul 24, 2025 at 12:50:05AM +0200, Thomas Gleixner wrote:
> >> May I ask the obvious question:
> >>
> >> How did this obviously disfunctional driver gain Tested-by and other
> >> relevant tags?
> >
> > I think the SG2042 pci driver does not support affinity setting, so it
> > is ignored. But the detail thing I will cc Chen Wang. I guess he can give
> > some details.
>
> It does not matter whether the PCI part supports it or not.
>
> PLIC definitely supports it and if the routing entry is not set up, then
> there is no way that an interrupt is delivered. As the routing entry
> write is delayed on startup until irq_enable() is invoked, this never
> happens because of PCI/MSI not having a irq_enable() callback.
>

You are right. While debugging this problem, I saw that some interrupts
are enabled when entering irq_set_affinity(), and they do not have the
IRQD_AFFINITY_MANAGED flag. So I think the problem was masked by this:
plic_set_affinity() enables the irq. As these irqs were enabled through
an unexpected path, I had not noticed the problem before.

> > For SG2044, I have tested at old version and it worked when submitting.
> > And I guess it is because the commit [1], which remove the irq_set_affinity.
> >
> > [1] https://lore.kernel.org/r/20240723132958.41320-6-marek.vasut+renesas@xxxxxxxxxxx
> >
> > IIRC, the worked version I tested has no affinity set and all irqs
> > are routed to 0, which is different from the behavior now. Another
>
> That does not make any sense. What sets the routing entry for CPU0?
>
> This really needs a coherent explanation. Works by chance is not an
> explanation at all :)
>

Yeah, I know. I did not dig into why it worked. This has taught me a
big lesson...

As the problem was masked by plic_set_affinity(), I think the old
working behavior may have the same cause. Routing to CPU0 is not the
real reason it worked; setting the affinity after enable is what did
the trick.
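
To make the sequence concrete, here is a toy model of what I think
happens. It is only a sketch and not the driver code: every toy_* name
is made up for illustration, and it only assumes what was said above,
that the routing entry write is deferred to the enable path and that
the set-affinity path ends up enabling the irq.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; this is NOT the PLIC driver. All names are hypothetical. */
struct toy_irq {
	bool enabled;    /* has the enable path run? */
	bool routed;     /* has the routing entry been written? */
	int target_cpu;  /* CPU in the routing entry, -1 if never written */
};

/* Startup: the routing entry write is deferred, so nothing is routed yet. */
static void toy_irq_startup(struct toy_irq *irq)
{
	irq->enabled = false;
	irq->routed = false;
	irq->target_cpu = -1;
}

/* Enable path: the only place in this model that writes the routing entry. */
static void toy_irq_enable(struct toy_irq *irq, int cpu)
{
	assert(cpu >= 0);
	irq->target_cpu = cpu;
	irq->routed = true;
	irq->enabled = true;
}

/* Set-affinity path: enables the irq as a side effect (the assumption). */
static void toy_set_affinity(struct toy_irq *irq, int cpu)
{
	toy_irq_enable(irq, cpu);
}

/* An interrupt can only be delivered once it is enabled and routed. */
static bool toy_irq_can_fire(const struct toy_irq *irq)
{
	return irq->enabled && irq->routed;
}
```

In this model, an irq that only goes through toy_irq_startup() can never
fire, and it starts working as soon as toy_set_affinity() is called, no
matter which CPU is chosen. That matches what I saw, but it is my
reading of the behavior, not a verified trace.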

Regards,
Inochi