RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq

From: Saurabh Singh Sengar
Date: Tue Mar 14 2023 - 06:24:46 EST




> -----Original Message-----
> From: Borislav Petkov <bp@xxxxxxxxx>
> Sent: Monday, March 13, 2023 4:44 PM
> To: Saurabh Singh Sengar <ssengar@xxxxxxxxxxxxx>
> Cc: Saurabh Sengar <ssengar@xxxxxxxxxxxxxxxxxxx>; tglx@xxxxxxxxxxxxx;
> mingo@xxxxxxxxxx; dave.hansen@xxxxxxxxxxxxxxx; x86@xxxxxxxxxx;
> hpa@xxxxxxxxx; johan+linaro@xxxxxxxxxx; isaku.yamahata@xxxxxxxxx;
> Michael Kelley (LINUX) <mikelley@xxxxxxxxxxxxx>; linux-
> kernel@xxxxxxxxxxxxxxx; rahul.tanwar@xxxxxxxxxxxxxxx;
> andriy.shevchenko@xxxxxxxxx
> Subject: Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq
>
> On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote:
> > To be specific in our system which is a guest VM we don't need IO-APIC
> > and hence there is no device tree node for it. It is observed that we get irq 0
> assigned to PCI-MSI.
>
> This should be added to your commit message: what guest VM is that and
> why should the kernel support it.

Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux
documentation is in Documentation/virt/hyperv/.

In commit I wanted to mention that any system which is not registering IO-APIC
will have this issue. But I am fine to mention specifically about the issue I am facing.
As part of your next comment, I have explained the issue in detail if that is good, I
can put that as commit message.

>
> Why doesn't it need an IO-APIC and why does the current code need to be
> changed just for your guest VM?

For Hyper-V Virtual Machines, few platforms don't have any devices to be
hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which
assigns interrupts to PCIe devices. In such platforms IO-APIC is not
registered which causes gsi_top value to remain at 0 and not get properly
assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC
flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound
function in io_apic.c decides the lower bound of irq numbers based on gsi_top.

Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first
virq number because gsi_top is still 0. 0 being invalid virq is ignored by
MSI irq domain and results allocation of the same PCIe MSI twice.

CPU0 CPU1
0: 2 0 Hyper-V PCIe MSI 1073741824-edge
1: 69 0 Hyper-V PCIe MSI 1073741824-edge nvme0q0

To avoid this issue, if IO-APIC and gsi_top are not initialized, return the
hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0.
This will also be identical to the behaviour of weak arch_dynirq_lower_bound
function defined in kernel/softirq.c.

>
> What else needs to be changed so that your VM works?

This is the only change required.

>
> Where is that VM's documentation and why can't that VM be fixed *not* to
> need kernel changes? IOW, why can't that VM emulate an IO-APIC like the
> others do...

Documentation is mentioned above. As there is no need of IO-APIC there is
no need emulating it.

Please let me know if there is any further clarification required.

>
> --
> Regards/Gruss,
> Boris.
>
> https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl
> e.kernel.org%2Ftglx%2Fnotes-about-
> netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C817c78e7bb324
> 8cd73b708db23b41c2a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%
> 7C638143028755917117%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA
> wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C
> %7C&sdata=3N5Mkl2gjMPHKOJGykZ3LvM6h%2FfD86dXLTQo3VH0Svc%3D&re
> served=0