Re: IRQ affinity not working on Xen pci-platform device

From: Thomas Gleixner
Date: Fri Mar 03 2023 - 19:28:24 EST


David!

On Fri, Mar 03 2023 at 16:54, David Woodhouse wrote:
> On Fri, 2023-03-03 at 17:51 +0100, Thomas Gleixner wrote:
>> >
>> > [    0.577173] ACPI: \_SB_.LNKC: Enabled at IRQ 11
>> > [    0.578149] The affinity mask was 0-3
>> > [    0.579081] The affinity mask is 0-3 and the handler is on 2
>> > [    0.580288] The affinity mask is 0 and the handler is on 2
>>
>> What happens is that once the interrupt is requested, the affinity
>> setting is deferred to the first interrupt. See the marvelous dance in
>> arch/x86/kernel/apic/msi.c::msi_set_affinity().
>>
>> If you do the setting before request_irq() then the startup will assign
>> it to the target mask right away.
>>
>> Btw, you are using irq_get_affinity_mask(), which gives you the desired
>> target mask. irq_get_effective_affinity_mask() gives you the real one.
>>
>> Can you verify that the thing moves over after the first interrupt or is
>> that too late already?
>
> It doesn't seem to move. The hack to just return IRQ_NONE if invoked on
> CPU != 0 was intended to do just that. It's a level-triggered interrupt
> so when the handler does nothing on the "wrong" CPU, it ought to get
> invoked again on the *correct* CPU and actually work that time.

So much for the theory. This is virt after all so it does not
necessarily behave like real hardware.

> But no, as the above logs show, it gets invoked twice *both* on CPU2.

Duh. I missed that. Can you instrument whether this ends up in the
actual irq affinity setter function of the underlying irq chip at all?

> If I do the setting before request_irq() then it should assign it right
> away (unless that IRQ was already in use?

Correct.

> It's theoretically a shared PCI INTx line). But even then, that would
> mean I'm messing with affinity on an IRQ I haven't even requested yet
> and don't own?

Well, that's not any different from userspace changing the affinity of
an interrupt.
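
For illustration only, a minimal userspace sketch of that path, assuming
the usual procfs files under /proc/irq/<N>/. The helper names
(irq_proc_path, irq_read_mask, irq_write_affinity) are made up for this
example and are not kernel or libc APIs. Reading "smp_affinity_list"
shows the requested mask, while "effective_affinity_list" (present only
if the kernel exposes it) shows where the interrupt is actually routed:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build "/proc/irq/<irq>/<file>" into buf.
 * Returns 0 on success, -1 if the path does not fit. */
int irq_proc_path(char *buf, size_t len, unsigned int irq, const char *file)
{
	int n;

	assert(buf != NULL && file != NULL);
	n = snprintf(buf, len, "/proc/irq/%u/%s", irq, file);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: read one line (a CPU list such as "0-3") from a
 * per-IRQ procfs file, e.g. "smp_affinity_list" for the requested mask
 * or "effective_affinity_list" for the effective one. */
int irq_read_mask(unsigned int irq, const char *file, char *out, size_t len)
{
	char path[64];
	FILE *f;

	if (irq_proc_path(path, sizeof(path), irq, file))
		return -1;
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(out, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	out[strcspn(out, "\n")] = '\0';	/* strip the trailing newline */
	return 0;
}

/* Hypothetical helper: request a new affinity, e.g. cpulist "0".
 * Requires root. The write does not depend on which driver, if any,
 * has requested the interrupt. */
int irq_write_affinity(unsigned int irq, const char *cpulist)
{
	char path[64];
	FILE *f;
	int ret;

	if (irq_proc_path(path, sizeof(path), irq, "smp_affinity_list"))
		return -1;
	f = fopen(path, "w");
	if (!f)
		return -1;
	ret = (fputs(cpulist, f) == EOF) ? -1 : 0;
	if (fclose(f) == EOF)	/* the kernel reports write errors here */
		ret = -1;
	return ret;
}
```

With IRQ 11 from the log above, irq_write_affinity(11, "0") followed by
irq_read_mask(11, "effective_affinity_list", ...) would show whether the
requested change took effect or is still pending.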

Thanks,

tglx