Re: Deadlock in usb subsystem on shutdown, 6.18.3+
From: Ben Greear
Date: Wed Jan 14 2026 - 09:36:52 EST
On 1/13/26 18:45, Hillf Danton wrote:
On Tue, 13 Jan 2026 16:21:07 -0800 Ben Greear wrote:
Hello,
We caught a deadlock that appears to be in the USB code during shutdown.
We do a lot of reboots and normally all goes well, so I don't think we
can reliably reproduce the problem.
INFO: task systemd-shutdow:1 blocked for more than 180 seconds.
Tainted: G S O 6.18.3+ #33
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:systemd-shutdow state:D stack:0 pid:1 tgid:1 ppid:0 task_flags:0x400100 flags:0x00080001
Call Trace:
<TASK>
__schedule+0x46b/0x1140
schedule+0x23/0xc0
schedule_preempt_disabled+0x11/0x20
__mutex_lock.constprop.0+0x4f7/0x9a0
device_shutdown+0xa0/0x220
kernel_restart+0x36/0x90
__do_sys_reboot+0x127/0x220
? do_writev+0x76/0x110
? do_writev+0x76/0x110
do_syscall_64+0x50/0x6d0
entry_SYSCALL_64_after_hwframe+0x4b/0x53
RIP: 0033:0x7fad03531087
RSP: 002b:00007ffe137cf918 EFLAGS: 00000246 ORIG_RAX: 00000000000000a9
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fad03531087
RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
RBP: 00007ffe137cfac0 R08: 0000000000000069 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
INFO: task systemd-shutdow:1 is blocked on a mutex likely owned by task kworker/4:1:16648.
This explains why the shutdown stalled.
INFO: task kworker/4:2:1520 blocked for more than 360 seconds.
Tainted: G S O 6.18.3+ #33
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/4:2 state:D stack:0 pid:1520 tgid:1520 ppid:2 task_flags:0x4288060 flags:0x00080000
Workqueue: events __usb_queue_reset_device
Call Trace:
<TASK>
__schedule+0x46b/0x1140
? schedule_timeout+0x79/0xf0
schedule+0x23/0xc0
usb_kill_urb+0x7b/0xc0
? housekeeping_affine+0x30/0x30
usb_start_wait_urb+0xd6/0x160
usb_control_msg+0xe2/0x140
hub_port_init+0x647/0xf70
usb_reset_and_verify_device+0x191/0x4a0
? device_release_driver_internal+0x4a/0x200
usb_reset_device+0x138/0x280
__usb_queue_reset_device+0x35/0x50
process_one_work+0x17e/0x390
worker_thread+0x2c8/0x3e0
? process_one_work+0x390/0x390
kthread+0xf7/0x1f0
? kthreads_online_cpu+0x100/0x100
? kthreads_online_cpu+0x100/0x100
ret_from_fork+0x114/0x140
? kthreads_online_cpu+0x100/0x100
ret_from_fork_asm+0x11/0x20
</TASK>
INFO: task kworker/4:1:16648 blocked for more than 360 seconds.
Tainted: G S O 6.18.3+ #33
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/4:1 state:D stack:0 pid:16648 tgid:16648 ppid:2 task_flags:0x4288060 flags:0x00080000
Workqueue: events __usb_queue_reset_device
Call Trace:
<TASK>
__schedule+0x46b/0x1140
schedule+0x23/0xc0
usb_kill_urb+0x7b/0xc0
The kworker failed to kill the URB within 300 seconds, so we know the underlying
USB hardware failed to respond in that time.
That said, "deadlock" in the subject line is inaccurate; this is a task hang
caused by a hardware glitch.
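To make the hang-versus-deadlock distinction concrete, here is a minimal sketch that reads the INFO lines of the report above and follows the "likely owned by" hints from pid 1. The helper names (`parse_report`, `wait_chain`) are made up for illustration, and the regular expressions only assume the message wording seen in this report. The chain it finds ends at a kworker with no recorded lock owner of its own, which fits a task stuck waiting on hardware rather than a cycle of locks. The report only carries one ownership hint, so this is suggestive, not proof.

```python
import re

# Only the INFO lines from the report quoted above; the stack traces are omitted.
REPORT = """\
INFO: task systemd-shutdow:1 blocked for more than 180 seconds.
INFO: task systemd-shutdow:1 is blocked on a mutex likely owned by task kworker/4:1:16648.
INFO: task kworker/4:2:1520 blocked for more than 360 seconds.
INFO: task kworker/4:1:16648 blocked for more than 360 seconds.
"""

# Patterns assume the wording seen in this particular report.
BLOCKED = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds\.")
OWNER = re.compile(
    r"INFO: task (\S+):(\d+) is blocked on a mutex likely owned by task (\S+):(\d+)\."
)


def parse_report(text):
    """Return ({pid: (comm, seconds_blocked)}, {waiter_pid: owner_pid})."""
    blocked, owners = {}, {}
    for line in text.splitlines():
        m = BLOCKED.search(line)
        if m:
            blocked[int(m.group(2))] = (m.group(1), int(m.group(3)))
            continue
        m = OWNER.search(line)
        if m:
            owners[int(m.group(2))] = int(m.group(4))
    return blocked, owners


def wait_chain(pid, owners):
    """Follow waiter -> owner links; a repeated pid would indicate a lock cycle."""
    chain, seen = [pid], {pid}
    while chain[-1] in owners:
        nxt = owners[chain[-1]]
        chain.append(nxt)
        if nxt in seen:
            break  # cycle found: this would be a true deadlock
        seen.add(nxt)
    return chain


blocked, owners = parse_report(REPORT)
chain = wait_chain(1, owners)
has_cycle = len(chain) != len(set(chain))
print(chain, "cycle" if has_cycle else "no cycle")
```

Running it against the excerpt gives a two-element chain from pid 1 to pid 16648 with no cycle. The traces above show both kworkers sleeping in usb_kill_urb.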
In the case where hardware is not responding, shouldn't we just consider it
dead and move on instead of deadlocking the whole OS?
In this case, the system was unplugged from a KVM (USB mouse and keyboard)
right around the time of shutdown, so I guess that would explain why the USB
device didn't respond.
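As a rough illustration of the "consider it dead and move on" idea, here is a user-space analogy in Python. It is not kernel code and not a proposed patch. The trace shows device_shutdown blocked in __mutex_lock; the sketch only shows what a bounded lock wait looks like in general. The function `shutdown_devices`, the device names, and the timeout value are all hypothetical, and whether anything like this is acceptable in the driver core is a question for the maintainers.

```python
import threading


def shutdown_devices(devices, timeout_s):
    """Shut down each device, skipping any whose lock cannot be taken in time.

    devices: list of (name, lock, shutdown_callable) tuples (hypothetical shape).
    Returns the names of the devices that were skipped.
    """
    skipped = []
    for name, lock, shutdown in devices:
        # Bounded wait: give up on this device instead of sleeping forever.
        if not lock.acquire(timeout=timeout_s):
            skipped.append(name)
            continue
        try:
            shutdown()
        finally:
            lock.release()
    return skipped


# Stand-in for the device lock held by a worker that never returns
# (in the report, the kworker stuck in usb_kill_urb).
stuck = threading.Lock()
stuck.acquire()

done = []
devices = [
    ("usb-kvm", stuck, lambda: done.append("usb-kvm")),       # hypothetical name
    ("disk", threading.Lock(), lambda: done.append("disk")),  # hypothetical name
]

skipped = shutdown_devices(devices, timeout_s=0.1)
print("skipped:", skipped, "shut down:", done)
```

The tradeoff is that a skipped device never gets its shutdown callback, so this only makes sense where an unresponsive device is worse than an unclean one.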
Thanks,
Ben
--
Ben Greear <greearb@xxxxxxxxxxxxxxx>
Candela Technologies Inc http://www.candelatech.com