Re: [tip:x86/alternatives] [x86/alternatives] ee8962082a: WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap

From: Borislav Petkov
Date: Tue Apr 30 2024 - 13:23:44 EST


+ Sean.

On Tue, Apr 30, 2024 at 11:00:52PM +0800, kernel test robot wrote:
>
>
> Hello,
>
> kernel test robot noticed "WARNING:at_arch/x86/kernel/cpu/cpuid-deps.c:#do_clear_cpu_cap" on:
>
> commit: ee8962082a4413dba1a1b3d3d23490c5221f3b8a ("x86/alternatives: Catch late X86_FEATURE modifiers")
> https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git x86/alternatives
>
> [test failed on linux-next/master bb7a2467e6beef44a80a17d45ebf2931e7631083]
>
> in testcase: lkvs
> version: lkvs-x86_64-b07d44a-1_20240401
> with following parameters:
>
> test: xsave
>
>
>
> compiler: gcc-13
> test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory
>
> (please refer to attached dmesg/kmsg for entire log/backtrace)
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
> | Closes: https://lore.kernel.org/oe-lkp/202404302233.f27f91b2-oliver.sang@xxxxxxxxx
>
>
> [ 0.055225][ T0] ------------[ cut here ]------------
> [ 0.055225][ T0] WARNING: CPU: 1 PID: 0 at arch/x86/kernel/cpu/cpuid-deps.c:118 do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
> [ 0.055225][ T0] Modules linked in:
> [ 0.055225][ T0] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.9.0-rc3-00001-gee8962082a44 #1
> [ 0.055225][ T0] Hardware name: Gigabyte Technology Co., Ltd. X299 UD4 Pro/X299 UD4 Pro-CF, BIOS F8a 04/27/2021
> [ 0.055225][ T0] RIP: 0010:do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
> [ 0.055225][ T0] Code: 89 c1 83 e0 07 48 c1 e9 03 83 c0 03 0f b6 14 11 38 d0 7c 08 84 d2 0f 85 b7 00 00 00 8b 15 f4 f7 7c 04 85 d2 0f 84 f2 fd ff ff <0f> 0b e9 eb fd ff ff 48 c7 c7 c0 eb 89 85 e8 41 fd ff ff 49 8d bf
> All code
> ========
> 0: 89 c1 mov %eax,%ecx
> 2: 83 e0 07 and $0x7,%eax
> 5: 48 c1 e9 03 shr $0x3,%rcx
> 9: 83 c0 03 add $0x3,%eax
> c: 0f b6 14 11 movzbl (%rcx,%rdx,1),%edx
> 10: 38 d0 cmp %dl,%al
> 12: 7c 08 jl 0x1c
> 14: 84 d2 test %dl,%dl
> 16: 0f 85 b7 00 00 00 jne 0xd3
> 1c: 8b 15 f4 f7 7c 04 mov 0x47cf7f4(%rip),%edx # 0x47cf816
> 22: 85 d2 test %edx,%edx
> 24: 0f 84 f2 fd ff ff je 0xfffffffffffffe1c
> 2a:* 0f 0b ud2 <-- trapping instruction
> 2c: e9 eb fd ff ff jmpq 0xfffffffffffffe1c
> 31: 48 c7 c7 c0 eb 89 85 mov $0xffffffff8589ebc0,%rdi
> 38: e8 41 fd ff ff callq 0xfffffffffffffd7e
> 3d: 49 rex.WB
> 3e: 8d .byte 0x8d
> 3f: bf .byte 0xbf
>
> Code starting with the faulting instruction
> ===========================================
> 0: 0f 0b ud2
> 2: e9 eb fd ff ff jmpq 0xfffffffffffffdf2
> 7: 48 c7 c7 c0 eb 89 85 mov $0xffffffff8589ebc0,%rdi
> e: e8 41 fd ff ff callq 0xfffffffffffffd54
> 13: 49 rex.WB
> 14: 8d .byte 0x8d
> 15: bf .byte 0xbf
> [ 0.055225][ T0] RSP: 0000:ffffc900001f7cd0 EFLAGS: 00010002
> [ 0.055225][ T0] RAX: 0000000000000003 RBX: ffff888817ca9020 RCX: 1ffffffff0b13da3
> [ 0.055225][ T0] RDX: 0000000000000001 RSI: 0000000000000085 RDI: ffffc900001f7d68
> [ 0.055225][ T0] RBP: ffffc900001f7d08 R08: 0000000000000000 R09: fffffbfff0962288
> [ 0.055225][ T0] R10: ffffffff84b11443 R11: 0000000000000001 R12: 0000000000000085
> [ 0.055225][ T0] R13: 0000000000000000 R14: 0000000000000000 R15: ffffffff8683dee0
> [ 0.055225][ T0] FS: 0000000000000000(0000) GS:ffff888817c80000(0000) knlGS:0000000000000000
> [ 0.055225][ T0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.055225][ T0] CR2: 0000000000000000 CR3: 000000089c85a001 CR4: 00000000003706b0
> [ 0.055225][ T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.055225][ T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 0.055225][ T0] Call Trace:
> [ 0.055225][ T0] <TASK>
> [ 0.055225][ T0] ? __warn (kernel/panic.c:694)
> [ 0.055225][ T0] ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
> [ 0.055225][ T0] ? report_bug (lib/bug.c:180 lib/bug.c:219)
> [ 0.055225][ T0] ? handle_bug (arch/x86/kernel/traps.c:239 (discriminator 1))
> [ 0.055225][ T0] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
> [ 0.055225][ T0] ? asm_exc_invalid_op (arch/x86/include/asm/idtentry.h:621)
> [ 0.055225][ T0] ? do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:118 (discriminator 1))
> [ 0.055225][ T0] ? __pfx_do_clear_cpu_cap (arch/x86/kernel/cpu/cpuid-deps.c:109)
> [ 0.055225][ T0] init_ia32_feat_ctl (arch/x86/kernel/cpu/feat_ctl.c:181)

Yap, works as designed:

	..
		goto update_sgx;

	if ( (tboot && !(msr & FEAT_CTL_VMX_ENABLED_INSIDE_SMX)) ||
	    (!tboot && !(msr & FEAT_CTL_VMX_ENABLED_OUTSIDE_SMX))) {
		if (IS_ENABLED(CONFIG_KVM_INTEL))
			pr_err_once("VMX (%s TXT) disabled by BIOS\n",
				    tboot ? "inside" : "outside");
		clear_cpu_cap(c, X86_FEATURE_VMX);	<--- here
	} else {

Clearing feature flags after alternatives have been applied means that
code which does

	alternative(..., X86_FEATURE_VMX, ...)

won't work as expected because the patching has already happened.
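
To make the ordering problem concrete, here is a minimal userspace sketch,
not kernel code. It assumes a simplified model in which "patching" just
selects a function pointer once, based on the feature bit at that moment.
All names ending in _sim are hypothetical stand-ins for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are the kernel's real definitions. */
enum { FEATURE_VMX_SIM = 0 };

static unsigned long feature_bits = 1UL << FEATURE_VMX_SIM;
static bool patched;			/* models "alternatives already applied" */
static int late_clear_warnings;		/* counts clears that came too late */

static int path_generic(void) { return 0; }
static int path_vmx(void)     { return 1; }

/* The call site starts out pointing at the generic path. */
static int (*call_site)(void) = path_generic;

/* One-time patching: the feature bit is sampled here and never again. */
static void apply_alternatives_sim(void)
{
	assert(!patched);
	if (feature_bits & (1UL << FEATURE_VMX_SIM))
		call_site = path_vmx;
	patched = true;
}

/* Clearing a bit after patching cannot undo the patch; only record it. */
static void clear_feature_sim(int bit)
{
	if (patched)
		late_clear_warnings++;
	feature_bits &= ~(1UL << bit);
}

static bool feature_enabled_sim(int bit)
{
	return (feature_bits & (1UL << bit)) != 0;
}
```

Calling apply_alternatives_sim() and then clear_feature_sim(FEATURE_VMX_SIM)
leaves the bit cleared while call_site still points at path_vmx(), and
late_clear_warnings is bumped once. That mismatch is the kind of thing the
new warning is there to catch.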

And I'm not sure that even the dynamic testing which the *cpu_has() helpers
do will always work, as we do this dance in get_cpu_cap() with the forced
flags.
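
One way that concern could play out, as a rough userspace model and under
the assumption that a per-CPU clear is not recorded in the forced masks
while a re-read of the capabilities reapplies only those masks. Whether
the kernel actually hits this is not established here. All _sim names are
hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* All names are hypothetical stand-ins for this sketch. */
#define BIT_SIM(n)	(1UL << (n))
enum { FEATURE_VMX_SIM = 5 };

struct cpu_sim {
	unsigned long caps;	/* per-CPU capability word */
};

static unsigned long hw_caps = BIT_SIM(FEATURE_VMX_SIM); /* what "CPUID" reports */
static unsigned long forced_cleared;	/* bits to clear after every re-read */
static unsigned long forced_set;	/* bits to set after every re-read */

/* Re-read the hardware bits, then reapply the forced masks. */
static void get_cpu_cap_sim(struct cpu_sim *c)
{
	assert(c);
	c->caps = hw_caps;
	c->caps &= ~forced_cleared;
	c->caps |= forced_set;
}

/* Per-CPU clear: touches only this CPU's word, not the forced mask. */
static void clear_cap_local_sim(struct cpu_sim *c, int bit)
{
	c->caps &= ~BIT_SIM(bit);
}

/* Forced clear: recorded in the mask, so it survives a re-read. */
static void clear_cap_forced_sim(struct cpu_sim *c, int bit)
{
	forced_cleared |= BIT_SIM(bit);
	c->caps &= ~BIT_SIM(bit);
}

/* Dynamic test of the per-CPU word. */
static bool cpu_has_sim(const struct cpu_sim *c, int bit)
{
	return (c->caps & BIT_SIM(bit)) != 0;
}
```

In this model, a bit cleared with clear_cap_local_sim() reappears after the
next get_cpu_cap_sim(), while one cleared with clear_cap_forced_sim() stays
cleared. So the answer a dynamic test gives depends on when it runs relative
to the re-read.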

So, I'm thinking init_ia32_feat_ctl() should run in early_init_intel(),
which runs before alternatives are applied.

And looking at init_ia32_feat_ctl(), all it does is set and clear
a bunch of bits, so I think moving it should be ok.

Sean?

> [ 0.055225][ T0] init_intel (arch/x86/include/asm/msr.h:146 arch/x86/include/asm/msr.h:300 arch/x86/kernel/cpu/intel.c:583 arch/x86/kernel/cpu/intel.c:687)
> [ 0.055225][ T0] identify_cpu (arch/x86/kernel/cpu/common.c:1824)
> [ 0.055225][ T0] identify_secondary_cpu (arch/x86/kernel/cpu/common.c:1949)
> [ 0.055225][ T0] smp_store_cpu_info (arch/x86/kernel/smpboot.c:333)
> [ 0.055225][ T0] start_secondary (arch/x86/kernel/smpboot.c:197 arch/x86/kernel/smpboot.c:281)
> [ 0.055225][ T0] ? __pfx_start_secondary (arch/x86/kernel/smpboot.c:231)
> [ 0.055225][ T0] common_startup_64 (arch/x86/kernel/head_64.S:421)
> [ 0.055225][ T0] </TASK>
> [ 0.055225][ T0] ---[ end trace 0000000000000000 ]---
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240430/202404302233.f27f91b2-oliver.sang@xxxxxxxxx
>
>
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
>

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette