Re: nr_cpu_ids vs AMD 3970x(32 physical CPUs)

From: Uladzislau Rezki
Date: Fri Jul 03 2020 - 13:24:26 EST


> >
> > I have a system based on an AMD 3970X CPU. It has 32 physical cores
> > and 64 threads. It seems that the "nr_cpu_ids" variable is not correctly
> > set on the latest 5.8-rc3 kernel. Please have a look at the dmesg output below:
> >
> > <snip>
> > urezki@pc638:~$ sudo dmesg | grep 128
> > [ 0.000000] IOAPIC[0]: apic_id 128, version 33, address 0xfec00000, GSI 0-23
> > [ 0.000000] smpboot: Allowing 128 CPUs, 64 hotplug CPUs
> > [ 0.000000] setup_percpu: NR_CPUS:512 nr_cpumask_bits:512 nr_cpu_ids:128 nr_node_ids:1
> > ...
> > [ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=128, Nodes=1
> > [ 0.000000] rcu: RCU restricting CPUs from NR_CPUS=512 to nr_cpu_ids=128.
> > [ 0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=128
> > urezki@pc638:~$
> > <snip>
> >
> > For example, SLUB thinks that it is dealing with 128 CPUs in the system,
> > which is wrong unless I am missing something. Since nr_cpu_ids is broken(?),
> > the "cpu_possible_mask" does not correspond to reality either.
> >
> > Any thoughts?
>
> This is not a 5.8-rc3 problem. Almost all AMD CPUs and APUs look
> like this.
> The only machine I own that gets this right is a dual EPYC box;
> everything else is broken regarding the right C/T & socket(s) count,
> and that is probably because it is using the NUMA code to get the info.
>
> I reported that a while back and no-one ever cared.
>
> There is even a comment in the hotplug code saying setting the wrong CPU count
> is a waste of resources.
>
> I have a 2200G that is reporting 48 cores.
>
> An AMD Ryzen 7 3750H is reporting twice the cores and twice the sockets.
>
> ...
>
> [ 0.040578] smpboot: Allowing 16 CPUs, 8 hotplug CPUs
> ...
> [ 0.382122] smpboot: Max logical packages: 2
> ..
>
> I boot all the boxes restricting the cores to the correct count on the
> command line.
>
> Wasted resource or not, this is still a bug IMO.
>
I suspect that DEFINE_PER_CPU variables can be twice as big,
but I have not actually checked it. So, if code needs to
identify the real number of CPUs, it can be a challenge :)
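
As a rough illustration of that challenge, here is a minimal userspace
sketch (not kernel code) that compares the possible, present and online
CPU sets exported under /sys/devices/system/cpu. The helper names
parse_cpu_list() and read_mask() are made up for this example. On a box
like the one in the dmesg output above, one would expect "possible" to
cover 128 CPUs while "present" and "online" cover 64, but I have not
verified that on this hardware.

```python
from pathlib import Path

# Standard sysfs location of the CPU masks on Linux.
SYSFS_CPU = Path("/sys/devices/system/cpu")


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-3,8,10-11" into a set of ints."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue  # an empty mask is exported as an empty line
        first, _, last = chunk.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def read_mask(name):
    """Return the CPU set for a sysfs mask, or None if it cannot be read."""
    try:
        return parse_cpu_list((SYSFS_CPU / name).read_text())
    except OSError:
        return None


if __name__ == "__main__":
    for name in ("possible", "present", "online"):
        cpus = read_mask(name)
        count = "n/a" if cpus is None else len(cpus)
        print(f"{name:9s} {count}")
```

A "possible" count larger than "present" is the same gap that smpboot
reports as hotplug CPUs.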

Thanks.

--
Vlad Rezki