[REGRESSION] Build failure on 6.9-rc2 with "x86/bugs: Fix the SRSO mitigation on Zen3/4"

From: Mateusz Jończyk
Date: Thu Apr 04 2024 - 17:12:23 EST


Hello,

The v6.9-rc2 kernel fails to build when CONFIG_MITIGATION_SRSO is disabled
but most other mitigations are enabled (incl. CONFIG_MITIGATION_UNRET_ENTRY):

[...]
      LD      vmlinux.o
      OBJCOPY modules.builtin.modinfo
      GEN     modules.builtin
      GEN     .vmlinux.objs
      MODPOST Module.symvers
    ERROR: modpost: "srso_alias_untrain_ret" [arch/x86/kvm/kvm-amd.ko] undefined!
    make[2]: *** [scripts/Makefile.modpost:145: Module.symvers] Error 1
    make[1]: *** [/media/1T-data/linux/linux-6.9-rc2/Makefile:1871: modpost] Error 2
    make: *** [Makefile:240: __sub-make] Error 2
    Command exited with non-zero status 2

An investigation pointed to the following commit:

commit 4535e1a4174c4111d92c5a9a21e542d232e0fcaa
Author: Borislav Petkov (AMD) <bp@xxxxxxxxx>
Date:   Thu Mar 28 13:59:05 2024 +0100

    x86/bugs: Fix the SRSO mitigation on Zen3/4
    
    The original version of the mitigation would patch in the calls to the
    untraining routines directly.  That is, the alternative() in UNTRAIN_RET
    will patch in the CALL to srso_alias_untrain_ret() directly.
    
    However, even if commit e7c25c441e9e ("x86/cpu: Cleanup the untrain
    mess") meant well in trying to clean up the situation, due to micro-
    architectural reasons, the untraining routine srso_alias_untrain_ret()
    must be the target of a CALL instruction and not of a JMP instruction as
    it is done now.
    
    Reshuffle the alternative macros to accomplish that.
    
    Fixes: e7c25c441e9e ("x86/cpu: Cleanup the untrain mess")
    Signed-off-by: Borislav Petkov (AMD) <bp@xxxxxxxxx>
    Reviewed-by: Ingo Molnar <mingo@xxxxxxxxxx>
    Cc: stable@xxxxxxxxxx
    Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

After reverting it, the kernel builds successfully.

Config extract:

    CONFIG_CC_HAS_RETURN_THUNK=y
    CONFIG_CC_HAS_ENTRY_PADDING=y
    CONFIG_FUNCTION_PADDING_CFI=11
    CONFIG_FUNCTION_PADDING_BYTES=16
    CONFIG_CALL_PADDING=y
    CONFIG_HAVE_CALL_THUNKS=y
    CONFIG_CALL_THUNKS=y
    CONFIG_PREFIX_SYMBOLS=y
    CONFIG_SPECULATION_MITIGATIONS=y
    CONFIG_MITIGATION_PAGE_TABLE_ISOLATION=y
    CONFIG_MITIGATION_RETPOLINE=y
    CONFIG_MITIGATION_RETHUNK=y
    CONFIG_MITIGATION_UNRET_ENTRY=y
    CONFIG_MITIGATION_CALL_DEPTH_TRACKING=y
    # CONFIG_CALL_THUNKS_DEBUG is not set
    CONFIG_MITIGATION_IBPB_ENTRY=y
    CONFIG_MITIGATION_IBRS_ENTRY=y
    # CONFIG_MITIGATION_SRSO is not set
    # CONFIG_MITIGATION_GDS_FORCE is not set
    # CONFIG_MITIGATION_RFDS is not set
    CONFIG_ARCH_HAS_ADD_PAGES=y

OS: Ubuntu 20.04, GCC 9.4.0

It looks to me as though, with the patch applied,
arch/x86/include/asm/nospec-branch.h references srso_alias_untrain_ret
whenever CONFIG_MITIGATION_UNRET_ENTRY=y, even when CONFIG_MITIGATION_SRSO=n.

Greetings,

Mateusz