Re: unchecked MSR access error: WRMSR to 0xd84 (tried to write 0x0000000000010003) at rIP: 0xffffffffa025a1b8 (snbep_uncore_msr_init_box+0x38/0x60 [intel_uncore])
From: Borislav Petkov
Date: Tue Mar 05 2024 - 07:10:41 EST
On Tue, Mar 05, 2024 at 11:14:04AM +0100, Thomas Gleixner wrote:
> It seems that none of the consumers of topology_num_cores_per_package()
> can actually be used on virt, so a reasonable restriction is to reject
> non-present CPUs on bare metal. Something like the below.
Yeah, workie.
Reported-by: Borislav Petkov (AMD) <bp@xxxxxxxxx>
Tested-by: Borislav Petkov (AMD) <bp@xxxxxxxxx>
Some relevant diffs of dmesg before and after:
+ACPI: Ignoring non-present APIC ID on bare metal
-CPU topo: Num. cores per package: 16
-CPU topo: Num. threads per package: 32
-CPU topo: Allowing 8 present CPUs plus 24 hotplug CPUs
+CPU topo: Num. cores per package: 4
+CPU topo: Num. threads per package: 8
+CPU topo: Allowing 8 present CPUs plus 0 hotplug CPUs
-setup_percpu: NR_CPUS:256 nr_cpumask_bits:32 nr_cpu_ids:32 nr_node_ids:1
+setup_percpu: NR_CPUS:256 nr_cpumask_bits:8 nr_cpu_ids:8 nr_node_ids:1
-pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07
-pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15
-pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23
-pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31
+pcpu-alloc: [0] 0 1 2 3 [0] 4 5 6 7
Those hotpluggable CPUs ended up wasting percpu memory too.
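
To spell out the arithmetic in the dmesg lines above, here is a rough sketch. It is purely illustrative and not kernel code: the helper name possible_cpus() is made up, and it only assumes that the "present plus hotplug" total is what shows up as nr_cpu_ids, which is what the numbers above suggest.

```python
# Illustrative only, not kernel code. possible_cpus() is a hypothetical helper.
def possible_cpus(present: int, hotplug: int) -> int:
    """Possible CPUs = present CPUs + reserved hotplug slots."""
    return present + hotplug

# Values taken from the "CPU topo: Allowing ..." lines in the dmesg diff.
before = possible_cpus(present=8, hotplug=24)  # matches nr_cpu_ids:32
after = possible_cpus(present=8, hotplug=0)    # matches nr_cpu_ids:8

# The pcpu-alloc lines show one unit per possible CPU, so the units that
# belonged to never-populated hotplug slots were pure overhead.
wasted_units = before - after

print(before, after, wasted_units)
```

That is 24 of the 32 percpu units set aside for CPUs which can never show up on this machine.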
As a result, APIC is not in physical flat mode anymore:
-APIC: Switched APIC routing to: physical flat
I guess ship it, but we'll have to pay attention to what else ends up
complaining.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette