Re: [tip: x86/bugs] x86/retpoline: Ensure default return thunk isn't used at runtime

From: Borislav Petkov
Date: Thu Jan 04 2024 - 08:12:42 EST


On Wed, Jan 03, 2024 at 07:46:56PM +0100, Borislav Petkov wrote:
> If only I can remember now how we did trigger the warning in the first
> place in order to test it...

Ok, got tired of trying to make it use the default thunk - it seems
kinda hard to do - which is good - or I simply can't think of a good way
to trigger it.

So I went and replaced the jump to the actual thunk:

Dump of assembler code for function default_idle_call:
0xffffffff8197bda0 <+0>: nopw (%rax)
0xffffffff8197bda4 <+4>: nop
...
0xffffffff8197bdda <+58>: xchg %ax,%ax
0xffffffff8197bddc <+60>: sti
0xffffffff8197bddd <+61>: nop
0xffffffff8197bdde <+62>: jmp 0xffffffff81988420 <srso_return_thunk>

to what it is at build time. I.e., what should *not* happen after
patch_returns() has run:

Dump of assembler code for function default_idle_call:
0xffffffff8197bda0 <+0>: nopw (%rax)
0xffffffff8197bda4 <+4>: nop
...
0xffffffff8197bdda <+58>: xchg %ax,%ax
0xffffffff8197bddc <+60>: sti
0xffffffff8197bddd <+61>: nop
0xffffffff8197bdde <+62>: jmp 0xffffffff819884a0 <__x86_return_thunk>

and yap, it fires as expected:

[ 209.051694] **********************************************************
[ 209.053200] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
[ 209.054435] **                                                      **
[ 209.055687] **   unpatched return thunk in use. This should not     **
[ 209.056911] **   on a production kernel. Please report this to      **
[ 209.058133] **   x86@xxxxxxxxxx.                                    **
[ 209.059367] **                                                      **
[ 209.060587] **   NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE NOTICE   **
[ 209.061808] **********************************************************
[ 209.063064] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 6.7.0-rc8+ #15
[ 209.064527] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 209.066086] Call Trace:
[ 209.066569] <TASK>
[ 209.066975] dump_stack_lvl+0x36/0x50
[ 209.067675] warn_thunk_thunk+0x1a/0x30
[ 209.068405] do_idle+0x1a5/0x1e0
[ 209.069403] cpu_startup_entry+0x29/0x30
[ 209.070147] rest_init+0xc5/0xd0
[ 209.070775] arch_call_rest_init+0xe/0x20
[ 209.071537] start_kernel+0x425/0x680
[ 209.072235] ? set_init_arg+0x80/0x80
[ 209.072931] x86_64_start_reservations+0x18/0x30
[ 209.073803] x86_64_start_kernel+0xb7/0xc0
[ 209.074590] secondary_startup_64_no_verify+0x175/0x17b
[ 209.075584] </TASK>
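
For anyone wanting to redo the swap by hand: a minimal sketch of the
displacement arithmetic behind it, assuming the return site is the usual
5-byte "jmp rel32" (opcode 0xe9). The helper names are made up for
illustration, and this is plain userspace C, not the kernel's patching
code and not necessarily how the replacement was done here.

```c
#include <assert.h>
#include <stdint.h>

#define JMP32_OPCODE 0xE9  /* x86 "jmp rel32" */
#define JMP32_SIZE   5     /* 1 opcode byte + 4 displacement bytes */

/*
 * Hypothetical helper: rel32 is relative to the address of the
 * instruction *following* the jmp, i.e. site + 5.
 */
static int32_t jmp32_rel(uint64_t site, uint64_t target)
{
	int64_t rel = (int64_t)(target - (site + JMP32_SIZE));

	/* A rel32 jmp can only reach +/- 2G from the next instruction. */
	assert(rel >= INT32_MIN && rel <= INT32_MAX);
	return (int32_t)rel;
}

/* Hypothetical helper: emit the 5 bytes of "jmp target" placed at site. */
static void jmp32_encode(uint8_t insn[JMP32_SIZE], uint64_t site,
			 uint64_t target)
{
	uint32_t rel = (uint32_t)jmp32_rel(site, target);

	insn[0] = JMP32_OPCODE;
	/* The displacement is stored little-endian. */
	for (int i = 0; i < 4; i++)
		insn[1 + i] = (uint8_t)(rel >> (8 * i));
}
```

With the addresses from the dumps above, the site at 0xffffffff8197bdde
gets rel32 0xc63d for srso_return_thunk and 0xc6bd for
__x86_return_thunk, so under that assumption the two encodings differ
only in the low displacement byte.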

Lemme write a proper patch.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette