Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

From: Borislav Petkov
Date: Tue Apr 30 2024 - 15:33:56 EST


On Tue, Apr 30, 2024 at 11:40:14AM -0700, Sean Christopherson wrote:
> Hmm, I don't think the problem is that init_ia32_feat_ctl() is called too late.
> It too is called from the BSP prior to alternative_instructions():
>
> arch_cpu_finalize_init()
> |
> -> identify_boot_cpu()
> |
> -> identify_cpu()
> |
> -> .c_init() => init_intel()

Yeah, but look at his stack trace:

[ 0.055225][ T0] init_intel (arch/x86/include/asm/msr.h:146 arch/x86/include/asm/msr.h:300 arch/x86/kernel/cpu/intel.c:583 arch/x86/kernel/cpu/intel.c:687)
[ 0.055225][ T0] identify_cpu (arch/x86/kernel/cpu/common.c:1824)
[ 0.055225][ T0] identify_secondary_cpu (arch/x86/kernel/cpu/common.c:1949)
[ 0.055225][ T0] smp_store_cpu_info (arch/x86/kernel/smpboot.c:333)

That's after alternatives.

> Ah, and the WARN even specifically checks for the case where there's divergence
> from the boot CPU:
>
> 	if (boot_cpu_has(feature))
> 		WARN_ON(alternatives_patched);

Funny you should mention that - I have this check in
setup_force_cpu_cap() too which works on boot_cpu_data *BUT*, actually,
the test in do_clear_cpu_cap() should be:

	if (c && cpu_has(c, feature))
		WARN_ON(alternatives_patched);

because changing a feature flag in *any* CPU's cap field after
alternatives have been patched is wrong, as explained earlier.

I know, our feature flags handling is a major mess.
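
To make the difference between the two conditions concrete, here is a
minimal user-space sketch. It is not kernel code: struct cpu_model,
model_has(), should_warn_boot() and should_warn_percpu() are hypothetical
stand-ins for struct cpuinfo_x86, cpu_has() and the two tests above, and
the single 64-bit capability word is an assumption made for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit; stands in for MAX_FEATURE_BITS. */
#define MODEL_FEATURE_BITS 64

/* Hypothetical stand-in for struct cpuinfo_x86: one capability word per CPU. */
struct cpu_model {
	unsigned long long caps;
};

/* Stand-in for cpu_has(): test one bit in this CPU's capability word. */
static bool model_has(const struct cpu_model *c, unsigned int feature)
{
	assert(feature < MODEL_FEATURE_BITS);
	return (c->caps >> feature) & 1ULL;
}

/* Current condition: only the boot CPU's flag is consulted. */
static bool should_warn_boot(const struct cpu_model *boot,
			     unsigned int feature, bool patched)
{
	return model_has(boot, feature) && patched;
}

/* Proposed condition: consult the CPU whose flag is about to be cleared. */
static bool should_warn_percpu(const struct cpu_model *c,
			       unsigned int feature, bool patched)
{
	return c && model_has(c, feature) && patched;
}
```

In this model, once patching is done, should_warn_boot() fires whenever the
boot CPU has the bit, regardless of the state of the CPU being modified.
should_warn_percpu() fires only if that CPU still has the bit set, and
never for a NULL pointer.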

> So I think this is a "real" warning about a misconfigured system, where VMX is
> fully configured in MSR_IA32_FEAT_CTL on the boot CPU, but is disabled on a
> secondary CPU.

And that's yet another issue. And it already warns about it:

[ 0.835741][ T1] smpboot: x86: Booting SMP configuration:
[ 0.836040][ T1] .... node #0, CPUs: #1 #2 #3 #4 #5 #6 #7 #8 #9 #10 #11 #12 #13 #14 #15 #16 #17
[ 0.055225][ T0] masked ExtINT on CPU#1
[ 0.055225][ T0] x86/cpu: VMX (outside TXT) disabled by BIOS
^^^^^^^^^^^^^^^^^^^^

Oliver, does the second warning go away if you do this?

---
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index 5dd427c6feb2..93fa2afc0c67 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -114,7 +114,7 @@ static void do_clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int feature)
 	if (WARN_ON(feature >= MAX_FEATURE_BITS))
 		return;
 
-	if (boot_cpu_has(feature))
+	if (c && cpu_has(c, feature))
 		WARN_ON(alternatives_patched);
 
 	clear_feature(c, feature);

--

My guess would be no, and that init_ia32_feat_ctl() really needs to run
before alternatives have been patched because it clears flags.
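
A rough user-space model of why a late clear is a problem, assuming the
usual behaviour that patching reads the flag once and fixes up the code.
Every name here (model_feature, call_site, model_patch_alternatives(),
model_clear_feature()) is hypothetical and only illustrates the ordering
issue, not the real patching machinery:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: one feature flag and one call site derived from it. */
static bool model_feature = true;
static bool model_patched = false;

static int slow_path(void) { return 0; }	/* generic variant */
static int fast_path(void) { return 1; }	/* feature-specific variant */

/* The call site starts out pointing at the generic variant. */
static int (*call_site)(void) = slow_path;

/* Stand-in for patching: read the flag once, then fix the call site. */
static void model_patch_alternatives(void)
{
	assert(!model_patched);	/* patching runs only once in this model */
	call_site = model_feature ? fast_path : slow_path;
	model_patched = true;
}

/*
 * Clear the flag. Returns true when the clear comes too late, i.e. the
 * flag was still set and the call site has already been patched from it.
 */
static bool model_clear_feature(void)
{
	bool too_late = model_feature && model_patched;

	model_feature = false;
	return too_late;
}
```

Calling model_patch_alternatives() and then model_clear_feature() leaves
model_feature false while call_site still points at fast_path(), so the
flag and the patched code disagree. Clearing first and patching afterwards
keeps them consistent.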

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette