Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

From: Sean Christopherson
Date: Tue Apr 30 2024 - 15:51:12 EST


On Tue, Apr 30, 2024, Borislav Petkov wrote:
> On Tue, Apr 30, 2024 at 11:40:14AM -0700, Sean Christopherson wrote:
> > Hmm, I don't think the problem is that init_ia32_feat_ctl() is called too late.
> > It too is called from the BSP prior to alternative_instructions():
> >
> >   arch_cpu_finalize_init()
> >   |
> >   -> identify_boot_cpu()
> >      |
> >      -> identify_cpu()
> >         |
> >         -> .c_init() => init_intel()
>
> Yeah, but look at his stacktrace:
>
> [ 0.055225][ T0] init_intel (arch/x86/include/asm/msr.h:146 arch/x86/include/asm/msr.h:300 arch/x86/kernel/cpu/intel.c:583
> +arch/x86/kernel/cpu/intel.c:687)
> [ 0.055225][ T0] identify_cpu (arch/x86/kernel/cpu/common.c:1824)
> [ 0.055225][ T0] identify_secondary_cpu (arch/x86/kernel/cpu/common.c:1949)
> [ 0.055225][ T0] smp_store_cpu_info (arch/x86/kernel/smpboot.c:333)
>
> That's after alternatives.
>
> > Ah, and the WARN even specifically checks for the case where there's divergence
> > from the boot CPU:
> >
> > 	if (boot_cpu_has(feature))
> > 		WARN_ON(alternatives_patched);
>
> Funny you should mention that - I have this check in
> setup_force_cpu_cap() too which works on boot_cpu_data *BUT*, actually,
> the test in do_clear_cpu_cap() should be:
>
> 	if (c && cpu_has(c, feature))
> 		WARN_ON(alternatives_patched);
>
> because setting a feature flag in *any* CPU's cap field is wrong after
> alternatives have been patched, as explained earlier.
>
> I know, our feature flags handling is a major mess.

..

> my guess would be no and that init_ia32_feat_ctl() really needs to go
> before alternatives have been patched because it clears flags.

But that would just mask the underlying problem, it wouldn't actually fix anything
other than making the WARN go away. Unless I'm misreading the splat+code, the
issue isn't that init_ia32_feat_ctl() clears VMX late, it's that the BSP sees
VMX as fully enabled, but at least one AP sees VMX as disabled.

I don't see how the kernel can expect to function correctly with divergent feature
support across CPUs, i.e. the WARN is a _good_ thing in this case, because it
alerts the user that their system is messed up, e.g. has a bad BIOS or something.
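
To make the divergence argument concrete, here is a minimal userspace sketch,
not kernel code. Every name in it (struct cpu_model, model_clear_cap(),
model_boot(), FEAT_VMX) is hypothetical and only loosely mirrors the
boot_cpu_has()/alternatives_patched check quoted above. It assumes the BSP
keeps VMX, alternatives get patched, and then an AP comes up whose firmware
setting may or may not match.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model; none of these are the kernel's real definitions. */
enum { FEAT_VMX = 0 };

struct cpu_model {
	unsigned long caps;	/* one bit per feature flag */
};

static struct cpu_model boot_cpu;	/* stands in for boot_cpu_data */
static bool patched;			/* stands in for alternatives_patched */
static int warn_count;			/* number of simulated WARNs */

static bool model_has(const struct cpu_model *c, int feature)
{
	return (c->caps & (1UL << feature)) != 0;
}

/*
 * Clear a feature on one CPU. Warn if the boot CPU advertised the feature
 * and code has already been patched based on the boot CPU's view.
 */
static void model_clear_cap(struct cpu_model *c, int feature)
{
	if (model_has(&boot_cpu, feature) && patched) {
		warn_count++;
		fprintf(stderr, "WARN: feature %d cleared after patching\n",
			feature);
	}
	c->caps &= ~(1UL << feature);
}

/*
 * Simulate a boot: the BSP keeps VMX, alternatives are patched, then one AP
 * is brought up. ap_vmx_enabled models what firmware configured on that AP.
 * Returns the number of simulated WARNs.
 */
static int model_boot(bool ap_vmx_enabled)
{
	struct cpu_model ap;

	warn_count = 0;
	patched = false;

	/* BSP: firmware enabled VMX, so nothing is cleared here. */
	boot_cpu.caps = 1UL << FEAT_VMX;

	/* Code is patched according to the BSP's feature flags. */
	patched = true;

	/* AP: CPUID reports VMX, but firmware may have left it disabled. */
	ap.caps = 1UL << FEAT_VMX;
	if (!ap_vmx_enabled)
		model_clear_cap(&ap, FEAT_VMX);

	return warn_count;
}
```

In this model, model_boot(true) produces no warning and model_boot(false)
produces exactly one. Moving the BSP's own clearing earlier changes nothing in
the second case, because the AP is only brought up after patching, so the
warning there reflects the BSP/AP mismatch rather than call ordering.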