Re: [BUG net-next] arch/x86/kernel/cpu/bugs.c:2935: "Unpatched return thunk in use. This should not happen!" [STACKTRACE]

From: Mirsad Todorovac
Date: Tue Mar 19 2024 - 21:29:34 EST


On 3/18/24 21:21, Borislav Petkov wrote:
> On Mon, Mar 18, 2024 at 08:47:26PM +0100, Mirsad Todorovac wrote:
>> With the latest net-next v6.8-5204-g237bb5f7f7f5 kernel, while running kselftest, there was this
>> trap and stacktrace:
>
> Send your kernel .config and how exactly you're triggering it, please.
>
> Thx.

Hi,

Please find the kernel .config attached.

I got another one of these "Unpatched return thunk" warnings, and it seems to be connected with selftests/kvm.

However, running the selftests/kvm tests one by one did not trigger the bug.

Best regards,
Mirsad Todorovac

Attachment: config-6.8.0-net-next-km-05204-g237bb5f7f7f5-dirty.xz
Description: application/xz

Mar 19 20:07:54 defiant kernel: [ 885.324733] ------------[ cut here ]------------
Mar 19 20:07:54 defiant kernel: [ 885.324737] Unpatched return thunk in use. This should not happen!
Mar 19 20:07:54 defiant kernel: [ 885.324740] WARNING: CPU: 14 PID: 7842 at arch/x86/kernel/cpu/bugs.c:2935 __warn_thunk (arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
Mar 19 20:07:54 defiant kernel: [ 885.324746] Modules linked in: xfrm_user nf_tables nfnetlink nvme_fabrics binfmt_misc nls_iso8859_1 intel_rapl_msr amd_atl snd_hda_codec_realtek intel_rapl_common snd_hda_codec_generic snd_hda_scodec_component snd_hda_codec_hdmi snd_hda_intel snd_intel_dspcfg edac_mce_amd snd_intel_sdw_acpi crct10dif_pclmul snd_hda_codec polyval_clmulni polyval_generic snd_hda_core ghash_clmulni_intel snd_hwdep sha512_ssse3 sha256_ssse3 amdgpu sha1_ssse3 snd_pcm aesni_intel snd_seq_midi crypto_simd snd_seq_midi_event cryptd snd_rawmidi amdxcp drm_exec joydev rapl gpu_sched wmi_bmof input_leds snd_seq drm_buddy drm_suballoc_helper drm_ttm_helper snd_seq_device ttm snd_timer k10temp drm_display_helper ccp cec snd drm_kms_helper soundcore i2c_algo_bit mac_hid tcp_bbr msr parport_pc ppdev lp parport drm efi_pstore ip_tables x_tables autofs4 btrfs blake2b_generic xor raid6_pq hid_generic nvme ahci nvme_core xhci_pci r8169 crc32_pclmul i2c_piix4 nvme_auth libahci xhci_pci_renesas realtek video wmi gpio_amdpt
Mar 19 20:07:54 defiant kernel: [ 885.324811] CPU: 14 PID: 7842 Comm: cpuid_test Not tainted 6.8.0-torv-11167-g4438a810f396-dirty #34
Mar 19 20:07:54 defiant kernel: [ 885.324814] Hardware name: ASRock X670E PG Lightning/X670E PG Lightning, BIOS 1.21 04/26/2023
Mar 19 20:07:54 defiant kernel: [ 885.324815] RIP: 0010:__warn_thunk (arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
Mar 19 20:07:54 defiant kernel: [ 885.324818] Code: 62 66 1d 01 83 e3 01 74 0e 48 8b 5d f8 c9 31 f6 31 ff e9 8e 99 3b 01 48 c7 c7 d8 11 81 b3 c6 05 f2 2f 8d 02 01 e8 00 ab 07 00 <0f> 0b 48 8b 5d f8 c9 31 f6 31 ff e9 6b 99 3b 01 90 90 90 90 90 90
All code
========
0: 62 66 1d 01 83 (bad)
5: e3 01 jrcxz 0x8
7: 74 0e je 0x17
9: 48 8b 5d f8 mov -0x8(%rbp),%rbx
d: c9 leave
e: 31 f6 xor %esi,%esi
10: 31 ff xor %edi,%edi
12: e9 8e 99 3b 01 jmp 0x13b99a5
17: 48 c7 c7 d8 11 81 b3 mov $0xffffffffb38111d8,%rdi
1e: c6 05 f2 2f 8d 02 01 movb $0x1,0x28d2ff2(%rip) # 0x28d3017
25: e8 00 ab 07 00 call 0x7ab2a
2a:* 0f 0b ud2 <-- trapping instruction
2c: 48 8b 5d f8 mov -0x8(%rbp),%rbx
30: c9 leave
31: 31 f6 xor %esi,%esi
33: 31 ff xor %edi,%edi
35: e9 6b 99 3b 01 jmp 0x13b99a5
3a: 90 nop
3b: 90 nop
3c: 90 nop
3d: 90 nop
3e: 90 nop
3f: 90 nop

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 48 8b 5d f8 mov -0x8(%rbp),%rbx
6: c9 leave
7: 31 f6 xor %esi,%esi
9: 31 ff xor %edi,%edi
b: e9 6b 99 3b 01 jmp 0x13b997b
10: 90 nop
11: 90 nop
12: 90 nop
13: 90 nop
14: 90 nop
15: 90 nop
Mar 19 20:07:54 defiant kernel: [ 885.324819] RSP: 0018:ffffadb65373bc30 EFLAGS: 00010046
Mar 19 20:07:54 defiant kernel: [ 885.324821] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
Mar 19 20:07:54 defiant kernel: [ 885.324822] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
Mar 19 20:07:54 defiant kernel: [ 885.324823] RBP: ffffadb65373bc38 R08: 0000000000000000 R09: 0000000000000000
Mar 19 20:07:54 defiant kernel: [ 885.324824] R10: 0000000000000000 R11: 0000000000000000 R12: ffff919b06ac8000
Mar 19 20:07:54 defiant kernel: [ 885.324825] R13: 0000000000000000 R14: 0000000000000000 R15: ffff919b06ac8780
Mar 19 20:07:54 defiant kernel: [ 885.324826] FS: 00007447cc59e740(0000) GS:ffff91a858100000(0000) knlGS:0000000000000000
Mar 19 20:07:54 defiant kernel: [ 885.324827] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 19 20:07:54 defiant kernel: [ 885.324828] CR2: 0000000000000000 CR3: 00000002c68f6000 CR4: 0000000000f50ef0
Mar 19 20:07:54 defiant kernel: [ 885.324830] PKRU: 55555554
Mar 19 20:07:54 defiant kernel: [ 885.324831] Call Trace:
Mar 19 20:07:54 defiant kernel: [ 885.324831] <TASK>
Mar 19 20:07:54 defiant kernel: [ 885.324833] ? show_regs (arch/x86/kernel/dumpstack.c:479)
Mar 19 20:07:54 defiant kernel: [ 885.324836] ? __warn_thunk (arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
Mar 19 20:07:54 defiant kernel: [ 885.324838] ? __warn (kernel/panic.c:677)
Mar 19 20:07:54 defiant kernel: [ 885.324841] ? __warn_thunk (arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
Mar 19 20:07:54 defiant kernel: [ 885.324843] ? report_bug (lib/bug.c:201 lib/bug.c:219)
Mar 19 20:07:54 defiant kernel: [ 885.324846] ? irq_work_queue (kernel/irq_work.c:119)
Mar 19 20:07:54 defiant kernel: [ 885.324849] ? handle_bug (arch/x86/kernel/traps.c:218)
Mar 19 20:07:54 defiant kernel: [ 885.324853] ? exc_invalid_op (arch/x86/kernel/traps.c:260 (discriminator 1))
Mar 19 20:07:54 defiant kernel: [ 885.324855] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:621)
Mar 19 20:07:54 defiant kernel: [ 885.324860] ? __warn_thunk (arch/x86/kernel/cpu/bugs.c:2935 (discriminator 3))
Mar 19 20:07:54 defiant kernel: [ 885.324863] warn_thunk_thunk (arch/x86/entry/entry.S:48)
Mar 19 20:07:54 defiant kernel: [ 885.324867] svm_vcpu_enter_exit (./include/linux/kvm_host.h:543 arch/x86/kvm/svm/svm.c:4115)
Mar 19 20:07:54 defiant kernel: [ 885.324869] svm_vcpu_run (arch/x86/kvm/svm/svm.c:4187)
Mar 19 20:07:54 defiant kernel: [ 885.324872] kvm_arch_vcpu_ioctl_run (arch/x86/kvm/x86.c:11003 arch/x86/kvm/x86.c:11184 arch/x86/kvm/x86.c:11410)
Mar 19 20:07:54 defiant kernel: [ 885.324877] ? kvm_vcpu_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4610)
Mar 19 20:07:54 defiant kernel: [ 885.324881] kvm_vcpu_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:4447)
Mar 19 20:07:54 defiant kernel: [ 885.324883] ? vcpu_put (./arch/x86/include/asm/preempt.h:103 arch/x86/kvm/../../../virt/kvm/kvm_main.c:225)
Mar 19 20:07:54 defiant kernel: [ 885.324886] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
Mar 19 20:07:54 defiant kernel: [ 885.324888] __x64_sys_ioctl (fs/ioctl.c:51 fs/ioctl.c:904 fs/ioctl.c:890 fs/ioctl.c:890)
Mar 19 20:07:54 defiant kernel: [ 885.324892] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
Mar 19 20:07:54 defiant kernel: [ 885.324893] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
Mar 19 20:07:54 defiant kernel: [ 885.324895] ? trace_hardirqs_on_prepare (kernel/trace/trace_preemptirq.c:47 kernel/trace/trace_preemptirq.c:42)
Mar 19 20:07:54 defiant kernel: [ 885.324897] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
Mar 19 20:07:54 defiant kernel: [ 885.324899] ? syscall_exit_to_user_mode (kernel/entry/common.c:215)
Mar 19 20:07:54 defiant kernel: [ 885.324901] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
Mar 19 20:07:54 defiant kernel: [ 885.324903] ? do_syscall_64 (./arch/x86/include/asm/cpufeature.h:171 arch/x86/entry/common.c:98)
Mar 19 20:07:54 defiant kernel: [ 885.324904] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
Mar 19 20:07:54 defiant kernel: [ 885.324906] ? irqentry_exit (kernel/entry/common.c:361)
Mar 19 20:07:54 defiant kernel: [ 885.324907] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
Mar 19 20:07:54 defiant kernel: [ 885.324909] ? exc_page_fault (arch/x86/mm/fault.c:1567)
Mar 19 20:07:54 defiant kernel: [ 885.324911] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
Mar 19 20:07:54 defiant kernel: [ 885.324913] RIP: 0033:0x7447cc31a94f
Mar 19 20:07:54 defiant kernel: [ 885.324933] Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff 77 1f 48 8b 44 24 18 64 48 2b 04 25 28 00
All code
========
0: 00 48 89 add %cl,-0x77(%rax)
3: 44 24 18 rex.R and $0x18,%al
6: 31 c0 xor %eax,%eax
8: 48 8d 44 24 60 lea 0x60(%rsp),%rax
d: c7 04 24 10 00 00 00 movl $0x10,(%rsp)
14: 48 89 44 24 08 mov %rax,0x8(%rsp)
19: 48 8d 44 24 20 lea 0x20(%rsp),%rax
1e: 48 89 44 24 10 mov %rax,0x10(%rsp)
23: b8 10 00 00 00 mov $0x10,%eax
28: 0f 05 syscall
2a:* 41 89 c0 mov %eax,%r8d <-- trapping instruction
2d: 3d 00 f0 ff ff cmp $0xfffff000,%eax
32: 77 1f ja 0x53
34: 48 8b 44 24 18 mov 0x18(%rsp),%rax
39: 64 fs
3a: 48 rex.W
3b: 2b .byte 0x2b
3c: 04 25 add $0x25,%al
3e: 28 00 sub %al,(%rax)

Code starting with the faulting instruction
===========================================
0: 41 89 c0 mov %eax,%r8d
3: 3d 00 f0 ff ff cmp $0xfffff000,%eax
8: 77 1f ja 0x29
a: 48 8b 44 24 18 mov 0x18(%rsp),%rax
f: 64 fs
10: 48 rex.W
11: 2b .byte 0x2b
12: 04 25 add $0x25,%al
14: 28 00 sub %al,(%rax)
Mar 19 20:07:54 defiant kernel: [ 885.324934] RSP: 002b:00007ffd611e2f50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Mar 19 20:07:54 defiant kernel: [ 885.324936] RAX: ffffffffffffffda RBX: 0000000012d56880 RCX: 00007447cc31a94f
Mar 19 20:07:54 defiant kernel: [ 885.324937] RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000007
Mar 19 20:07:54 defiant kernel: [ 885.324938] RBP: 00007447cc59e6c0 R08: 0000000000000000 R09: 0000000000000001
Mar 19 20:07:54 defiant kernel: [ 885.324939] R10: 000000000000001f R11: 0000000000000246 R12: 0000000012d56880
Mar 19 20:07:54 defiant kernel: [ 885.324940] R13: 0000000000000041 R14: 0000000000427e18 R15: 00007447cc601040
Mar 19 20:07:54 defiant kernel: [ 885.324943] </TASK>
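
For readers skimming the trace above: frames prefixed with "?" are ones the unwinder could not confirm, so the call path worth following is the unprefixed frames. The following is a minimal sketch, assuming the log format shown above (syslog prefix followed by a bracketed timestamp), that pulls out the CPU, PID, command name and the reliable frames. The names summarize_warning, FRAME_RE, COMM_RE and SAMPLE are illustrative only and not part of any kernel tooling; SAMPLE is a shortened excerpt of the log above.

```python
import re

# Matches a call-trace frame such as
#   "[ 885.324867] svm_vcpu_enter_exit (./include/linux/kvm_host.h:543 ...)"
# The optional "? " prefix marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(r"\]\s+(\?\s+)?([A-Za-z_][\w.]*) \(")
# Matches the "CPU: 14 PID: 7842 Comm: cpuid_test ..." header line.
COMM_RE = re.compile(r"CPU: (\d+) PID: (\d+) Comm: (\S+)")

# Shortened excerpt of the log above (hypothetical selection of lines).
SAMPLE = """\
Mar 19 20:07:54 defiant kernel: [ 885.324811] CPU: 14 PID: 7842 Comm: cpuid_test Not tainted 6.8.0-torv-11167-g4438a810f396-dirty #34
Mar 19 20:07:54 defiant kernel: [ 885.324831] <TASK>
Mar 19 20:07:54 defiant kernel: [ 885.324833] ? show_regs (arch/x86/kernel/dumpstack.c:479)
Mar 19 20:07:54 defiant kernel: [ 885.324863] warn_thunk_thunk (arch/x86/entry/entry.S:48)
Mar 19 20:07:54 defiant kernel: [ 885.324867] svm_vcpu_enter_exit (./include/linux/kvm_host.h:543 arch/x86/kvm/svm/svm.c:4115)
Mar 19 20:07:54 defiant kernel: [ 885.324869] svm_vcpu_run (arch/x86/kvm/svm/svm.c:4187)
Mar 19 20:07:54 defiant kernel: [ 885.324886] ? srso_alias_return_thunk (arch/x86/lib/retpoline.S:181)
Mar 19 20:07:54 defiant kernel: [ 885.324943] </TASK>
""".splitlines()


def summarize_warning(log_lines):
    """Return (cpu, pid, comm, reliable_frames) for the first <TASK> trace."""
    cpu = pid = comm = None
    frames = []
    in_trace = False
    for line in log_lines:
        header = COMM_RE.search(line)
        if header and comm is None:
            cpu, pid, comm = int(header.group(1)), int(header.group(2)), header.group(3)
        if "</TASK>" in line:
            break  # end of the first trace
        if "<TASK>" in line:
            in_trace = True
            continue
        if in_trace:
            frame = FRAME_RE.search(line)
            # Keep only frames without the "?" marker.
            if frame and not frame.group(1):
                frames.append(frame.group(2))
    return cpu, pid, comm, frames


if __name__ == "__main__":
    cpu, pid, comm, frames = summarize_warning(SAMPLE)
    print(f"CPU {cpu}, PID {pid}, comm {comm}")
    for name in frames:
        print("  " + name)
```

Run against the full log, this would leave the path from entry_SYSCALL_64_after_hwframe through kvm_vcpu_ioctl and svm_vcpu_run up to warn_thunk_thunk, with the command name cpuid_test taken from the header line.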