Re: [PATCH V3 0/4] genirq/affinity: irq vector spread among online CPUs as far as possible

From: Ming Lei
Date: Tue Mar 13 2018 - 04:36:02 EST


On Tue, Mar 13, 2018 at 09:38:41AM +0200, Artem Bityutskiy wrote:
> On Tue, 2018-03-13 at 11:11 +0800, Dou Liyang wrote:
> > I have also seen a case where the BIOS told ACPI that the machine
> > supported physical CPU hotplug, but there were actually no hardware
> > slots in the machine. ACPI tables are like user input: they should
> > be validated before we use them.
>
> This is exactly what happens on Skylake Xeon systems. When I check
> dmesg or this file:
>
> /sys/devices/system/cpu/possible
>
> on 2S (two socket) and 4S (four socket) systems, I see the same number
> 432.
>
> This number comes from the ACPI MADT. I speculate (I have not seen this
> myself) that 8S systems report the same number as well, because of the
> Skylake-SP (Scalable Platform) architecture.
>
> Number 432 is good for 8S systems, but it is way too large for 2S and
> 4S systems - 4x or 2x larger than the theoretical maximum.
>
> I do not know why BIOSes have to report unrealistically high numbers, I
> am just sharing my observation.
>
> So yes, Linux kernel's possible CPU count knowledge may be too large.
> If we use that number to evenly spread IRQ vectors among the CPUs, we
> end up with wasted vectors, and even bugs, as I observe on a 2S
> Skylake.

Then it looks like this issue needs to be fixed by making the possible CPU
count accurate, because other resources, such as percpu variables, are also
allocated according to num_possible_cpus().
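
To see how far the possible count is from reality on a given machine, the
sysfs CPU lists can be compared from userspace. The following is only a
rough sketch, not anything from the patch set: the helper names count_cpus()
and read_cpu_count() are made up for illustration, and it assumes the sysfs
files contain plain comma-separated ranges such as "0-431".

```python
def count_cpus(cpulist: str) -> int:
    """Count CPUs in a cpulist string such as "0-3,8-11".

    Assumes only single CPU numbers and "lo-hi" ranges appear.
    """
    total = 0
    for part in cpulist.strip().split(","):
        if not part:
            continue  # empty list, e.g. an empty file
        if "-" in part:
            lo, hi = part.split("-", 1)
            total += int(hi) - int(lo) + 1
        else:
            total += 1
    return total


def read_cpu_count(kind: str, base: str = "/sys/devices/system/cpu") -> int:
    """Read <base>/<kind> (possible, present or online) and count CPUs."""
    with open(f"{base}/{kind}") as f:
        return count_cpus(f.read())


if __name__ == "__main__":
    # A large gap between "possible" and "present" suggests the firmware
    # is advertising hotplug capacity the hardware may not have.
    for kind in ("possible", "present", "online"):
        print(kind, read_cpu_count(kind))
```

On the 2S and 4S Skylake systems described above, the "possible" line would
be expected to show 432, while "present" shows the CPUs actually installed.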

Thanks,
Ming