BUG: soft lockup in run_timer_softirq
From: Zw Tang
Date: Tue Apr 21 2026 - 08:38:57 EST
Hi,
I am reporting a soft lockup issue triggered by a syzkaller reproducer on
Linux 7.0.0-08391-g1d51b370a0f8.
The watchdog reports CPU#1 stuck for 22s in the timer softirq path. The call
trace shows run_timer_softirq() calling __run_timer_base.part.0(), and the
reported RIP is in _raw_spin_unlock_irq(). The reproducer repeatedly invokes
perf_event_open() with crafted perf_event_attr values, so this looks like a
timer softirq stall, likely triggered by the perf event subsystem.
Reproducer:
C reproducer: pastebin.com/raw/Rghf0nF6
console output: pastebin.com/raw/rWkVMZF2
kernel config: pastebin.com/raw/NSpXqAz9
Kernel:
HEAD commit: 1d51b370a0f8
git tree:
kernel version: 7.0.0-08391-g1d51b370a0f8 #1 PREEMPT(lazy) (QEMU)
The observed soft lockup log is:
watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [repro:684]
CPU: 1 UID: 0 PID: 684 Comm: repro Tainted: G B D W N 7.0.0-08391-g1d51b370a0f8 #1 PREEMPT(lazy)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:_raw_spin_unlock_irq+0x2b/0x40
Call Trace:
__run_timer_base.part.0+0x6e4/0xb10
? call_timer_fn+0x590/0x590
? __sysvec_apic_timer_interrupt+0x2d6/0x3a0
run_timer_softirq+0x16c/0x1e0
handle_softirqs+0x195/0x7f0
? __hrtimer_rearm_deferred+0x1fd/0x580
irq_exit_rcu+0x141/0x1c0
sysvec_apic_timer_interrupt+0x70/0x80
asm_sysvec_apic_timer_interrupt+0x1a/0x20
finish_task_switch+0x1e5/0x8e0
__schedule+0x11b9/0x4a20
preempt_schedule_common+0x2f/0x50
preempt_schedule_thunk+0x16/0x30
_raw_spin_unlock+0x2e/0x30
do_wp_page+0xb4a/0x3c60
__handle_mm_fault+0x1ac9/0x2ab0
handle_mm_fault+0x40f/0xd90
do_user_addr_fault+0x6be/0x1970
exc_page_fault+0xb0/0x180
asm_exc_page_fault+0x26/0x30
The reproducer mainly performs repeated perf_event_open() calls with unusual
attribute combinations. In the trace, the frames from finish_task_switch() down
to asm_exc_page_fault() belong to the task that the APIC timer interrupt
interrupted, so the visible stall point is in the timer softirq path rather
than in the userspace fault path. My current guess is that a timer callback
associated with perf events keeps the CPU stuck inside
__run_timer_base.part.0().
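To illustrate this reading of the stack, here is a minimal Python sketch (not
part of the reproducer) that splits the call trace from this report into the
interrupt-side frames and the frames of the interrupted task. The helper name
split_trace() and the use of asm_sysvec_apic_timer_interrupt as the boundary
are assumptions made for this illustration; frames prefixed with "?" are
skipped because the unwinder marks them as unreliable.

```python
import re

# Call trace copied from this report. Lines starting with "?" are frames
# the kernel unwinder could not verify.
TRACE = """\
__run_timer_base.part.0+0x6e4/0xb10
? call_timer_fn+0x590/0x590
? __sysvec_apic_timer_interrupt+0x2d6/0x3a0
run_timer_softirq+0x16c/0x1e0
handle_softirqs+0x195/0x7f0
? __hrtimer_rearm_deferred+0x1fd/0x580
irq_exit_rcu+0x141/0x1c0
sysvec_apic_timer_interrupt+0x70/0x80
asm_sysvec_apic_timer_interrupt+0x1a/0x20
finish_task_switch+0x1e5/0x8e0
__schedule+0x11b9/0x4a20
preempt_schedule_common+0x2f/0x50
preempt_schedule_thunk+0x16/0x30
_raw_spin_unlock+0x2e/0x30
do_wp_page+0xb4a/0x3c60
__handle_mm_fault+0x1ac9/0x2ab0
handle_mm_fault+0x40f/0xd90
do_user_addr_fault+0x6be/0x1970
exc_page_fault+0xb0/0x180
asm_exc_page_fault+0x26/0x30
"""

# One frame per line: optional "? " marker, symbol, then +offset/size in hex.
FRAME_RE = re.compile(
    r"^(?P<guess>\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+$"
)


def split_trace(text, boundary="asm_sysvec_apic_timer_interrupt"):
    """Return (interrupt_frames, interrupted_frames), skipping '?' frames.

    Assumption for this sketch: everything up to and including `boundary`
    ran in interrupt context; everything after it is the interrupted task.
    """
    interrupt, interrupted = [], []
    target = interrupt
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m is None or m.group("guess"):
            continue
        target.append(m.group("sym"))
        if m.group("sym") == boundary:
            target = interrupted
    return interrupt, interrupted


irq_frames, task_frames = split_trace(TRACE)
print("interrupt path:  ", " <- ".join(irq_frames))
print("interrupted task:", " <- ".join(task_frames))
```

With the trace above, the interrupt path starts at __run_timer_base.part.0 and
the interrupted task's frames start at finish_task_switch, which matches the
interpretation that the page-fault path was only the context being interrupted.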
Please let me know if I should provide a minimized reproducer or any additional
debugging information.
Thanks.