Re: [WARNING] RCU stall in sock_def_readable()

From: Steven Rostedt

Date: Fri Apr 17 2026 - 08:44:25 EST


On Thu, 16 Apr 2026 17:16:11 -0700
"Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:

> One "hail Mary" thought is to revert this guy and see if it helps:
>
> d41e37f26b31 ("rcu: Fix rcu_read_unlock() deadloop due to softirq")
>
> This commit fixes a bug, so we cannot revert it in mainline, but there
> is some reason to believe that there are more bugs beyond the one that
> it fixed, and it might have (through no fault of its own) made those
> other bugs more probable.
>
> Worth a try, anyway!

Hail Marys are worth a try, but the reason they call it a Hail Mary is
that it is unlikely to succeed :-p

run test ssh -t root@tracetest "trace-cmd record -p function -e syscalls /work/c/hackbench_64 50"
ssh -t root@tracetest "trace-cmd record -p function -e syscalls /work/c/hackbench_64 50" ...
[ 209.590500] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 209.592620] rcu: Tasks blocked on level-0 rcu_node (CPUs 0-3): P3151/1:b..l
[ 209.595266] rcu: (detected by 0, t=6502 jiffies, g=29673, q=186 ncpus=4)
[ 209.597557] task:hackbench_64 state:R running task stack:0 pid:3151 tgid:3151 ppid:3144 task_flags:0x400000 flags:0x00080000
[ 209.601871] Call Trace:
[ 209.602852] <TASK>
[ 209.603752] __schedule+0x4ac/0x12f0
[ 209.605172] preempt_schedule_common+0x26/0xe0
[ 209.606755] ? preempt_schedule_thunk+0x16/0x30
[ 209.608337] preempt_schedule_thunk+0x16/0x30
[ 209.609973] ? _raw_spin_unlock_irqrestore+0x39/0x70
[ 209.611688] _raw_spin_unlock_irqrestore+0x5d/0x70
[ 209.613408] sock_def_readable+0x9c/0x2b0
[ 209.614841] unix_stream_sendmsg+0x2d7/0x710
[ 209.616420] sock_write_iter+0x185/0x190
[ 209.617934] vfs_write+0x457/0x5b0
[ 209.619242] ksys_write+0xc8/0xf0
[ 209.620532] do_syscall_64+0x117/0x1660
[ 209.621936] ? irqentry_exit+0xd9/0x690
[ 209.623319] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 209.625199] RIP: 0033:0x7f603e8e5190
[ 209.626628] RSP: 002b:00007ffd003f99c8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[ 209.629304] RAX: ffffffffffffffda RBX: 00007ffd003f9b58 RCX: 00007f603e8e5190
[ 209.631710] RDX: 0000000000000001 RSI: 00007ffd003f99ef RDI: 0000000000000006
[ 209.634200] RBP: 00007ffd003f9a40 R08: 0011861580000000 R09: 0000000000000000
[ 209.636638] R10: 00007f603e8064d0 R11: 0000000000000202 R12: 0000000000000000
[ 209.639050] R13: 00007ffd003f9b70 R14: 00005637df126dd8 R15: 00007f603ea10020
[ 209.641600] </TASK>
Detected kernel crash!


That was with the revert :-(

-- Steve