Summary: The introduction of asynchronous device shutdown in commit 8064952c6504
("driver core: shut down devices asynchronously") leads to frequent hangs on
shutdown, even with the fix from commit 4f2c346e6216 ("driver core: fix async
device shutdown hang") applied.
I did some further experimenting (and lots of reboots ...) and found that
the bug is preemption related: for me it only occurs with CONFIG_PREEMPT=y
or CONFIG_PREEMPT_RT=y. With CONFIG_PREEMPT_NONE=y or
CONFIG_PREEMPT_VOLUNTARY=y everything works fine.
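For anyone trying to reproduce this, it helps to confirm which preemption model a given build actually uses. The following is only a minimal sketch: the helper name preempt_model and the model list are my own illustration, not anything from the kernel tree. It reports the build-time setting only; on kernels built with CONFIG_PREEMPT_DYNAMIC=y the effective model can be overridden at boot, which this does not detect.

```python
import re

# Preemption models named in this report. PREEMPT_RT is checked first
# because some kernels set it alongside another PREEMPT option.
MODELS = ("PREEMPT_RT", "PREEMPT", "PREEMPT_VOLUNTARY", "PREEMPT_NONE")


def preempt_model(config_text):
    """Return the preemption model enabled in a kernel .config, or None."""
    # Collect every CONFIG_PREEMPT* symbol that is set to "y". Helper
    # symbols such as PREEMPT_RCU also match but are ignored below.
    enabled = set(
        re.findall(r"^CONFIG_(PREEMPT(?:_[A-Z]+)?)=y$", config_text, re.MULTILINE)
    )
    for model in MODELS:
        if model in enabled:
            return model
    return None
```

Typical use would be preempt_model(open(".config").read()), where the .config path is an assumption about where the build configuration lives.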
Test results (linux-next-20240925):
PREEMPT_NONE        20 reboots, no failures
PREEMPT_VOLUNTARY   20 reboots, no failures
PREEMPT             3 reboots, 4th reboot failed
PREEMPT_RT          2 reboots, 3rd reboot failed
The behaviour can be improved by increasing the number of min_active items
in the async workqueue:
T115;4 locks held by kworker/7:2/343:
T115; #0: ffff91ea00050d48 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x4a4/0x580
T115; #1: ffffbaf182e07e58 ((work_completion)(&helper->damage_work)){+.+.}-{0:0}, at: process_one_work+0x1c7/0x580
T115; #2: ffffbaf182e07d00 (crtc_ww_class_acquire){+.+.}-{0:0}, at: drm_atomic_helper_dirtyfb+0x47/0x280
T115; #3: ffff91ea13b80528 (crtc_ww_class_mutex){+.+.}-{3:3}, at: modeset_lock+0xbf/0x1b0