[linus:master] [x86/module] 5185e7f9f3: BUG:soft_lockup-CPU##stuck_for#s![perf:#]
From: kernel test robot
Date: Thu Dec 12 2024 - 09:38:07 EST
Hello,
kernel test robot noticed "BUG:soft_lockup-CPU##stuck_for#s![perf:#]" on:
commit: 5185e7f9f3bd754ab60680814afd714e2673ef88 ("x86/module: enable ROX caches for module text on 64 bit")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
[test failed on linus/master 7503345ac5f5e82fd9a36d6e6b447c016376403a]
[test failed on linux-next/master ebe1b11614e079c5e366ce9bd3c8f44ca0fbcc1b]
in testcase: lkvs
version: lkvs-x86_64-2187c57-1_20241102
with the following parameters:
test: pt
config: x86_64-dcg_x86_64_defconfig-func
compiler: gcc-12
test machine: 224 threads 2 sockets Intel(R) Xeon(R) Platinum 8480+ (Sapphire Rapids) with 256G memory
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags:
| Reported-by: kernel test robot <oliver.sang@xxxxxxxxx>
| Closes: https://lore.kernel.org/oe-lkp/202412122201.67d21c2b-lkp@xxxxxxxxx
[ 737.450753][ C63] watchdog: BUG: soft lockup - CPU#63 stuck for 26s! [perf:95490]
[ 737.460225][ C63] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common intel_ifs i10nm_edac skx_edac_common nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp dax_hmem ofpart iTCO_wdt cxl_acpi qat_4xxx intel_pmc_bxt kvm_intel spi_nor cxl_port pmt_telemetry iTCO_vendor_support mtd ipmi_ssif kvm isst_if_mbox_pci intel_th_gth isst_if_mmio intel_sdsi pmt_class intel_qat i2c_i801 cxl_core mei_me spi_intel_pci pinctrl_emmitsburg ast crc32c_intel einj pinctrl_intel intel_th_pci dh_generic drm_shmem_helper cdc_ether isst_if_common idxd mei i2c_smbus crc8 intel_vsec intel_th i2c_ismt spi_intel ipmi_si joydev pwm_lpss acpi_power_meter btrfs binfmt_misc fuse ip_tables
[ 737.536652][ C63] CPU: 63 UID: 0 PID: 95490 Comm: perf Tainted: G S 6.12.0-rc6-00142-g5185e7f9f3bd #1
[ 737.549630][ C63] Tainted: [S]=CPU_OUT_OF_SPEC
[ 737.555668][ C63] Hardware name: Intel Corporation D50DNP1SBB/D50DNP1SBB, BIOS SE5C7411.86B.8118.D04.2206151341 06/15/2022
[ 737.569144][ C63] RIP: 0010:find_vmap_area_exceed_addr_lock (mm/vmalloc.c:1034 mm/vmalloc.c:1066)
[ 737.577511][ C63] Code: 89 f8 48 c1 e8 03 42 80 3c 38 00 0f 85 62 02 00 00 48 8b 5b 10 48 85 db 74 3b 48 8d 7b f8 48 89 f8 48 c1 e8 03 42 80 3c 38 00 <0f> 85 f1 01 00 00 4c 3b 73 f8 72 a9 48 8d 7b 08 48 89 f8 48 c1 e8
All code
========
0: 89 f8 mov %edi,%eax
2: 48 c1 e8 03 shr $0x3,%rax
6: 42 80 3c 38 00 cmpb $0x0,(%rax,%r15,1)
b: 0f 85 62 02 00 00 jne 0x273
11: 48 8b 5b 10 mov 0x10(%rbx),%rbx
15: 48 85 db test %rbx,%rbx
18: 74 3b je 0x55
1a: 48 8d 7b f8 lea -0x8(%rbx),%rdi
1e: 48 89 f8 mov %rdi,%rax
21: 48 c1 e8 03 shr $0x3,%rax
25: 42 80 3c 38 00 cmpb $0x0,(%rax,%r15,1)
2a:* 0f 85 f1 01 00 00 jne 0x221 <-- trapping instruction
30: 4c 3b 73 f8 cmp -0x8(%rbx),%r14
34: 72 a9 jb 0xffffffffffffffdf
36: 48 8d 7b 08 lea 0x8(%rbx),%rdi
3a: 48 89 f8 mov %rdi,%rax
3d: 48 rex.W
3e: c1 .byte 0xc1
3f: e8 .byte 0xe8
Code starting with the faulting instruction
===========================================
0: 0f 85 f1 01 00 00 jne 0x1f7
6: 4c 3b 73 f8 cmp -0x8(%rbx),%r14
a: 72 a9 jb 0xffffffffffffffb5
c: 48 8d 7b 08 lea 0x8(%rbx),%rdi
10: 48 89 f8 mov %rdi,%rax
13: 48 rex.W
14: c1 .byte 0xc1
15: e8 .byte 0xe8
[ 737.600980][ C63] RSP: 0018:ffa0000035d3f7c0 EFLAGS: 00000246
[ 737.608491][ C63] RAX: 1fe220043da2917a RBX: ff110021ed148bd8 RCX: ffffffff813cb361
[ 737.618177][ C63] RDX: 0000000000000000 RSI: 0000000000000004 RDI: ff110021ed148bd0
[ 737.627846][ C63] RBP: 000000000000005e R08: 0000000000000001 R09: fff3fc0006ba7eea
[ 737.637506][ C63] R10: 0000000000000003 R11: 00007fece9080fff R12: ffffffffc0801000
[ 737.647177][ C63] R13: ff1100010c992bc8 R14: ffffffffc0600000 R15: dffffc0000000000
[ 737.656846][ C63] FS: 00007fed2a754840(0000) GS:ff11003fcc180000(0000) knlGS:0000000000000000
[ 737.667580][ C63] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 737.675680][ C63] CR2: ff1100013ca00000 CR3: 0000002169956003 CR4: 0000000000f73ef0
[ 737.685338][ C63] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 737.694977][ C63] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 737.704624][ C63] PKRU: 55555554
[ 737.709273][ C63] Call Trace:
[ 737.713624][ C63] <IRQ>
[ 737.717447][ C63] ? watchdog_timer_fn (kernel/watchdog.c:762)
[ 737.723792][ C63] ? __pfx_watchdog_timer_fn (kernel/watchdog.c:677)
[ 737.730545][ C63] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1691 kernel/time/hrtimer.c:1755)
[ 737.737187][ C63] ? __pfx___hrtimer_run_queues (kernel/time/hrtimer.c:1725)
[ 737.744234][ C63] ? ktime_get_update_offsets_now (kernel/time/timekeeping.c:195 (discriminator 3) kernel/time/timekeeping.c:395 (discriminator 3) kernel/time/timekeeping.c:403 (discriminator 3) kernel/time/timekeeping.c:2449 (discriminator 3))
[ 737.751558][ C63] ? hrtimer_interrupt (kernel/time/hrtimer.c:1820)
[ 737.757924][ C63] ? __sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1038 arch/x86/kernel/apic/apic.c:1055)
[ 737.765354][ C63] ? sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1049 arch/x86/kernel/apic/apic.c:1049)
[ 737.772485][ C63] </IRQ>
[ 737.776390][ C63] <TASK>
[ 737.780307][ C63] ? asm_sysvec_apic_timer_interrupt (arch/x86/include/asm/idtentry.h:702)
[ 737.787823][ C63] ? 0xffffffffc0600000
[ 737.793079][ C63] ? do_raw_spin_lock (arch/x86/include/asm/atomic.h:107 include/linux/atomic/atomic-arch-fallback.h:2170 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 kernel/locking/spinlock_debug.c:116)
[ 737.799294][ C63] ? find_vmap_area_exceed_addr_lock (mm/vmalloc.c:1034 mm/vmalloc.c:1066)
[ 737.806842][ C63] ? 0xffffffffc0600000
[ 737.812040][ C63] ? 0xffffffffc0600000
[ 737.817228][ C63] vread_iter (mm/vmalloc.c:4354)
[ 737.822483][ C63] ? __pfx_fault_in_safe_writeable (mm/gup.c:2185)
[ 737.829690][ C63] ? __pfx_vread_iter (mm/vmalloc.c:4337)
[ 737.835622][ C63] ? 0xffffffffc0600000
[ 737.840761][ C63] ? 0xffffffffc0600000
[ 737.845890][ C63] read_kcore_iter (fs/proc/kcore.c:534)
[ 737.851801][ C63] ? 0xffffffffc0600000
[ 737.856897][ C63] ? __pfx_read_kcore_iter (fs/proc/kcore.c:325)
[ 737.863261][ C63] ? __filemap_add_folio (mm/filemap.c:943)
[ 737.869597][ C63] ? __pfx___filemap_add_folio (mm/filemap.c:852)
[ 737.876300][ C63] ? __pfx_workingset_update_node (mm/workingset.c:617)
[ 737.883288][ C63] ? preempt_count_add (include/linux/ftrace.h:976 kernel/sched/core.c:5777 kernel/sched/core.c:5774 kernel/sched/core.c:5802)
[ 737.889298][ C63] ? __folio_batch_add_and_move (arch/x86/include/asm/preempt.h:103 mm/swap.c:246)
[ 737.896253][ C63] ? preempt_count_add (include/linux/ftrace.h:976 kernel/sched/core.c:5777 kernel/sched/core.c:5774 kernel/sched/core.c:5802)
[ 737.902233][ C63] ? copy_page_from_iter_atomic (include/linux/highmem-internal.h:234 lib/iov_iter.c:484)
[ 737.909273][ C63] ? __vfs_getxattr (fs/xattr.c:419)
[ 737.914937][ C63] ? __pfx_copy_page_from_iter_atomic (lib/iov_iter.c:462)
[ 737.922254][ C63] ? simple_write_end (arch/x86/include/asm/atomic.h:67 include/linux/atomic/atomic-arch-fallback.h:2278 include/linux/atomic/atomic-instrumented.h:1384 include/linux/page_ref.h:205 include/linux/mm.h:1141 include/linux/mm.h:1146 include/linux/mm.h:1477 fs/libfs.c:985)
[ 737.928198][ C63] ? generic_perform_write (mm/filemap.c:4077)
[ 737.934609][ C63] ? __pfx___fsnotify_parent (fs/notify/fsnotify.c:216)
[ 737.941037][ C63] ? file_update_time (fs/inode.c:2272)
[ 737.946882][ C63] ? preempt_count_add (include/linux/ftrace.h:976 kernel/sched/core.c:5777 kernel/sched/core.c:5774 kernel/sched/core.c:5802)
[ 737.952802][ C63] proc_reg_read_iter (fs/proc/inode.c:299)
[ 737.958732][ C63] vfs_read (fs/read_write.c:488 fs/read_write.c:569)
[ 737.963662][ C63] ? __pfx_vfs_read (fs/read_write.c:550)
[ 737.969169][ C63] ? __asan_memset (mm/kasan/shadow.c:84)
[ 737.974591][ C63] ? preempt_count_add (include/linux/ftrace.h:976 kernel/sched/core.c:5777 kernel/sched/core.c:5774 kernel/sched/core.c:5802)
[ 737.980499][ C63] ? fdget_pos (arch/x86/include/asm/atomic64_64.h:15 include/linux/atomic/atomic-arch-fallback.h:2583 include/linux/atomic/atomic-long.h:38 include/linux/atomic/atomic-instrumented.h:3189 fs/file.c:1150 fs/file.c:1158)
[ 737.985704][ C63] ksys_read (fs/read_write.c:713)
[ 737.990625][ C63] ? __pfx_ksys_read (fs/read_write.c:702)
[ 737.996234][ C63] ? fpregs_assert_state_consistent (arch/x86/kernel/fpu/context.h:38 arch/x86/kernel/fpu/core.c:822)
[ 738.003307][ C63] do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
[ 738.008632][ C63] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
[ 738.015535][ C63] RIP: 0033:0x7fed2b67b19d
[ 738.020760][ C63] Code: 31 c0 e9 c6 fe ff ff 50 48 8d 3d 66 54 0a 00 e8 49 ff 01 00 66 0f 1f 84 00 00 00 00 00 80 3d 41 24 0e 00 00 74 17 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 5b c3 66 2e 0f 1f 84 00 00 00 00 00 48 83 ec
All code
========
0: 31 c0 xor %eax,%eax
2: e9 c6 fe ff ff jmp 0xfffffffffffffecd
7: 50 push %rax
8: 48 8d 3d 66 54 0a 00 lea 0xa5466(%rip),%rdi # 0xa5475
f: e8 49 ff 01 00 call 0x1ff5d
14: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
1b: 00 00
1d: 80 3d 41 24 0e 00 00 cmpb $0x0,0xe2441(%rip) # 0xe2465
24: 74 17 je 0x3d
26: 31 c0 xor %eax,%eax
28: 0f 05 syscall
2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
30: 77 5b ja 0x8d
32: c3 ret
33: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
3a: 00 00 00
3d: 48 rex.W
3e: 83 .byte 0x83
3f: ec in (%dx),%al
Code starting with the faulting instruction
===========================================
0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
6: 77 5b ja 0x63
8: c3 ret
9: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
10: 00 00 00
13: 48 rex.W
14: 83 .byte 0x83
15: ec in (%dx),%al
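A note for reading the kernel-side disassembly above: the repeated sequence "shr $0x3,%rax" followed by "cmpb $0x0,(%rax,%r15,1)" looks like an inline KASAN shadow-memory check (the trace also contains __asan_memset, so this appears to be a KASAN build). The sketch below only re-does that arithmetic with the values from the register dump. It assumes the usual x86-64 mapping, shadow = (addr >> 3) + offset, with R15 holding the offset; the function and constant names are illustrative and not taken from the kernel source.

```python
# Sketch of the generic KASAN shadow mapping on x86-64.
# Assumptions: scale shift of 3 (from "shr $0x3") and R15 holding the offset.
KASAN_SHADOW_SCALE_SHIFT = 3
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000  # R15 in the register dump
MASK64 = (1 << 64) - 1


def kasan_shadow_addr(addr: int) -> int:
    """Return the address of the shadow byte covering `addr` (hypothetical helper)."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & MASK64


# Values copied from the register dump in this report.
rbx = 0xFF110021ED148BD8
rdi = 0xFF110021ED148BD0  # produced by: lea -0x8(%rbx),%rdi
rax = 0x1FE220043DA2917A  # produced by: mov %rdi,%rax; shr $0x3,%rax

print("rbx - 8      =", hex(rbx - 8))
print("rdi >> 3     =", hex(rdi >> KASAN_SHADOW_SCALE_SHIFT))
print("shadow byte @", hex(kasan_shadow_addr(rdi)))
```

The dump is consistent with this reading: RBX - 8 equals RDI, and RDI shifted right by 3 equals RAX. In other words, the CPU was interrupted right after checking the shadow byte for the load at -0x8(%rbx), inside the loop in find_vmap_area_exceed_addr_lock.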
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20241212/202412122201.67d21c2b-lkp@xxxxxxxxx
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki