Re: Hung tasks due to a AB-BA deadlock between the leds_list_lock rwsem and the rtnl mutex
From: Hans de Goede
Date: Fri May 31 2024 - 05:50:33 EST
Hi,
On 5/31/24 10:39 AM, Linux regression tracking (Thorsten Leemhuis) wrote:
> [adding the LED folks and the regressions list to the list of recipients]
>
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> Lee, Pavel, could you look into below regression report please? Thread
> starts here:
> https://lore.kernel.org/all/9d189ec329cfe68ed68699f314e191a10d4b5eda.camel@xxxxxxxxxxxx/
>
> Another report with somewhat similar symptom can be found here:
> https://lore.kernel.org/lkml/e441605c-eaf2-4c2d-872b-d8e541f4cf60@xxxxxxxxx/
>
> See also Russell's analysis of that report below (many many thx for
> that, much appreciated Russel!).
>
> To my untrained eyes all of this sounds a lot like we still have a 6.9
> regression related to the LED code somewhere. Reminder, we had earlier
> trouble, but that was avoided through other measures:
>
> * 3d913719df14c2 ("wifi: iwlwifi: Use request_module_nowait") /
> https://lore.kernel.org/lkml/30f757e3-73c5-5473-c1f8-328bab98fd7d@xxxxxxxxxxxxxxx/
>
> * c04d1b9ecce565 ("igc: Fix LED-related deadlock on driver unbind") /
> https://lore.kernel.org/all/ZhRD3cOtz5i-61PB@mail-itl/
>
> * 19fa4f2a85d777 ("r8169: fix LED-related deadlock on module removal")
>
> That iwlwifi commit even calls it self "work around". The developer that
> submitted it bisected the problem to a LED merge, but sadly that was the
> end of it. :-/
I actually have been looking at a ledtrig-netdev lockdep warning yesterday
which I believe is the same thing. I'll include the lockdep trace below.
According to lockdep there indeed is a ABBA (ish) cyclic deadlock with
the rtnl mutex vs led-triggers related locks. I believe that this problem
may be a pre-existing problem but this now actually gets hit in kernels >=
6.9 because of commit 66601a29bb23 ("leds: class: If no default trigger is
given, make hw_control trigger the default trigger"). Before that commit
the "netdev" trigger would not be bound / set as phy LEDs trigger by default.
+Cc Heiner Kallweit who authored that commit.
The netdev trigger typically is not needed because the PHY LEDs are typically
under hw-control and the netdev trigger even tries to leave things that way
so setting it as the active trigger for the LED class device is basically
a no-op. I guess the goal of that commit is correctly have the triggers
file content reflect that the LED is controlled by a netdev and to allow
changing the hw-control mode without the user first needing to set netdev
as trigger before being able to change the mode.
But there is a price to this, besides the locking problem this also
causes the ledtrig-netdev module to load on pretty much everyones
systems (when build as a module) even though 99.999% of our users
likely does not need this at all...
Given this price and the troubles this is causing I think it might be best
to revert 66601a29bb23. There might still be a locking issue when setting
the trigger to netdev manually (I'll check and follow up) but this should
fix the regression users are hitting since typically users do not set
the trigger manually.
Gene, as the original reporter of this can you do "modinfo ledtrig_netdev"
and if this shows that ledtrig_netdev is a module for you try blacklisting
ledtrig_netdev ? And if it is not a module can you try building a 6.9
kernel with commit 66601a29bb23 reverted and see if that helps ?
Regards,
Hans
Here is the promised lockdep report:
[ 73.959672] ======================================================
[ 73.959675] WARNING: possible circular locking dependency detected
[ 73.959677] 6.10.0-rc1+ #94 Not tainted
[ 73.959680] ------------------------------------------------------
[ 73.959682] NetworkManager/1815 is trying to acquire lock:
[ 73.959685] ffffffffb2145170 (triggers_list_lock){++++}-{3:3}, at: led_trigger_register+0x40/0x1b0
[ 73.959695]
but task is already holding lock:
[ 73.959697] ffffffffb2158fe8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x133/0x630
[ 73.959706]
which lock already depends on the new lock.
[ 73.959708]
the existing dependency chain (in reverse order) is:
[ 73.959710]
-> #2 (rtnl_mutex){+.+.}-{3:3}:
[ 73.959716] __mutex_lock+0x8c/0xc10
[ 73.959720] set_device_name+0x2d/0x140 [ledtrig_netdev]
[ 73.959725] netdev_trig_activate+0x197/0x210 [ledtrig_netdev]
[ 73.959728] led_trigger_set+0x1e1/0x2e0
[ 73.959731] led_trigger_register+0x170/0x1b0
[ 73.959734] do_one_initcall+0x5e/0x3a0
[ 73.959738] do_init_module+0x60/0x220
[ 73.959741] __do_sys_init_module+0x15f/0x190
[ 73.959745] do_syscall_64+0x93/0x180
[ 73.959748] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 73.959752]
-> #1 (&led_cdev->trigger_lock){+.+.}-{3:3}:
[ 73.959758] down_write+0x3b/0xd0
[ 73.959761] led_trigger_set_default+0x34/0xe0
[ 73.959764] led_classdev_register_ext+0x311/0x3a0
[ 73.959767] input_leds_connect+0x139/0x260
[ 73.959770] input_attach_handler.isra.0+0x75/0x90
[ 73.959773] input_register_device.cold+0xa1/0x150
[ 73.959776] hidinput_connect+0x848/0xb00
[ 73.959779] hid_connect+0x567/0x5a0
[ 73.959783] hid_hw_start+0x3f/0x60
[ 73.959787] hid_device_probe+0x10d/0x190
[ 73.959789] really_probe+0xde/0x340
[ 73.959793] __driver_probe_device+0x78/0x110
[ 73.959796] driver_probe_device+0x1f/0xa0
[ 73.959798] __device_attach_driver+0x85/0x110
[ 73.959801] bus_for_each_drv+0x78/0xc0
[ 73.959805] __device_attach+0xb0/0x1b0
[ 73.959808] bus_probe_device+0x94/0xb0
[ 73.959810] device_add+0x64a/0x860
[ 73.959814] hid_add_device+0xe5/0x240
[ 73.959817] usbhid_probe+0x4bb/0x600
[ 73.959821] usb_probe_interface+0xea/0x2b0
[ 73.959824] really_probe+0xde/0x340
[ 73.959827] __driver_probe_device+0x78/0x110
[ 73.959830] driver_probe_device+0x1f/0xa0
[ 73.959833] __device_attach_driver+0x85/0x110
[ 73.959835] bus_for_each_drv+0x78/0xc0
[ 73.959839] __device_attach+0xb0/0x1b0
[ 73.959842] bus_probe_device+0x94/0xb0
[ 73.959844] device_add+0x64a/0x860
[ 73.959847] usb_set_configuration+0x5e8/0x880
[ 73.959850] usb_generic_driver_probe+0x3e/0x60
[ 73.959854] usb_probe_device+0x3d/0x120
[ 73.959857] really_probe+0xde/0x340
[ 73.959859] __driver_probe_device+0x78/0x110
[ 73.959862] driver_probe_device+0x1f/0xa0
[ 73.959865] __device_attach_driver+0x85/0x110
[ 73.959868] bus_for_each_drv+0x78/0xc0
[ 73.959871] __device_attach+0xb0/0x1b0
[ 73.959874] bus_probe_device+0x94/0xb0
[ 73.959876] device_add+0x64a/0x860
[ 73.959880] usb_new_device.cold+0x141/0x38f
[ 73.959883] hub_event+0x1166/0x1980
[ 73.959886] process_one_work+0x21a/0x590
[ 73.959889] worker_thread+0x1d1/0x3e0
[ 73.959891] kthread+0xee/0x120
[ 73.959895] ret_from_fork+0x30/0x50
[ 73.959898] ret_from_fork_asm+0x1a/0x30
[ 73.959902]
-> #0 (triggers_list_lock){++++}-{3:3}:
[ 73.959908] __lock_acquire+0x11c6/0x1f20
[ 73.959911] lock_acquire+0xc8/0x2b0
[ 73.959914] down_write+0x3b/0xd0
[ 73.959916] led_trigger_register+0x40/0x1b0
[ 73.959919] phy_led_triggers_register+0xb4/0x240
[ 73.959923] phy_attach_direct+0x378/0x380
[ 73.959926] phy_connect_direct+0x21/0x70
[ 73.959929] rtl_open+0x2e1/0x490 [r8169]
[ 73.959934] __dev_open+0xea/0x1b0
[ 73.959937] __dev_change_flags+0x201/0x240
[ 73.959939] dev_change_flags+0x22/0x60
[ 73.959942] do_setlink+0x32d/0x1070
[ 73.959945] __rtnl_newlink+0x516/0x9d0
[ 73.959948] rtnl_newlink+0x43/0x70
[ 73.959951] rtnetlink_rcv_msg+0x159/0x630
[ 73.959953] netlink_rcv_skb+0x4f/0x100
[ 73.959956] netlink_unicast+0x18c/0x260
[ 73.959959] netlink_sendmsg+0x207/0x420
[ 73.959962] ____sys_sendmsg+0x364/0x3a0
[ 73.959965] ___sys_sendmsg+0x84/0xd0
[ 73.959967] __sys_sendmsg+0x8e/0xd0
[ 73.959971] do_syscall_64+0x93/0x180
[ 73.959974] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 73.959977]
other info that might help us debug this:
[ 73.959979] Chain exists of:
triggers_list_lock --> &led_cdev->trigger_lock --> rtnl_mutex
[ 73.959987] Possible unsafe locking scenario:
[ 73.959989] CPU0 CPU1
[ 73.959991] ---- ----
[ 73.959993] lock(rtnl_mutex);
[ 73.959996] lock(&led_cdev->trigger_lock);
[ 73.960000] lock(rtnl_mutex);
[ 73.960004] lock(triggers_list_lock);
[ 73.960007]
*** DEADLOCK ***
[ 73.960009] 1 lock held by NetworkManager/1815:
[ 73.960012] #0: ffffffffb2158fe8 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x133/0x630
[ 73.960020]
stack backtrace:
[ 73.960023] CPU: 5 PID: 1815 Comm: NetworkManager Not tainted 6.10.0-rc1+ #94
[ 73.960026] Hardware name: Micro-Star International Co., Ltd. MS-7C95/B550M PRO-VDH WIFI (MS-7C95), BIOS 2.J0 01/02/2024
[ 73.960029] Call Trace:
[ 73.960031] <TASK>
[ 73.960034] dump_stack_lvl+0x68/0x90
[ 73.960039] check_noncircular+0x10d/0x120
[ 73.960044] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960048] ? __lock_acquire+0xc19/0x1f20
[ 73.960055] __lock_acquire+0x11c6/0x1f20
[ 73.960062] lock_acquire+0xc8/0x2b0
[ 73.960065] ? led_trigger_register+0x40/0x1b0
[ 73.960073] down_write+0x3b/0xd0
[ 73.960076] ? led_trigger_register+0x40/0x1b0
[ 73.960079] led_trigger_register+0x40/0x1b0
[ 73.960083] phy_led_triggers_register+0xb4/0x240
[ 73.960089] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960091] ? schedule_timeout+0xc1/0x1b0
[ 73.960095] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960097] ? find_held_lock+0x2b/0x80
[ 73.960101] ? phy_attach_direct+0x1b0/0x380
[ 73.960105] ? phy_attach_direct+0x1b0/0x380
[ 73.960108] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960110] ? lock_release+0x169/0x2b0
[ 73.960115] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960118] ? __mutex_unlock_slowpath+0x2d/0x260
[ 73.960124] phy_attach_direct+0x378/0x380
[ 73.960128] ? __pfx_r8169_phylink_handler+0x10/0x10 [r8169]
[ 73.960134] phy_connect_direct+0x21/0x70
[ 73.960139] rtl_open+0x2e1/0x490 [r8169]
[ 73.960147] __dev_open+0xea/0x1b0
[ 73.960152] __dev_change_flags+0x201/0x240
[ 73.960155] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960160] dev_change_flags+0x22/0x60
[ 73.960165] do_setlink+0x32d/0x1070
[ 73.960172] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960175] ? __lock_acquire+0x3ce/0x1f20
[ 73.960178] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960180] ? __nla_validate_parse+0x52/0xc70
[ 73.960185] ? __entry_text_end+0x1025c9/0x1025cd
[ 73.960190] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960193] ? __lock_acquire+0x3ce/0x1f20
[ 73.960199] __rtnl_newlink+0x516/0x9d0
[ 73.960204] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960211] ? __entry_text_end+0x1025c9/0x1025cd
[ 73.960214] ? rcu_is_watching+0xd/0x40
[ 73.960217] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960220] ? kmalloc_trace_noprof+0x25c/0x320
[ 73.960224] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960230] rtnl_newlink+0x43/0x70
[ 73.960235] rtnetlink_rcv_msg+0x159/0x630
[ 73.960241] ? __pfx_rtnetlink_rcv_msg+0x10/0x10
[ 73.960246] netlink_rcv_skb+0x4f/0x100
[ 73.960257] netlink_unicast+0x18c/0x260
[ 73.960262] netlink_sendmsg+0x207/0x420
[ 73.960270] ____sys_sendmsg+0x364/0x3a0
[ 73.960273] ? import_iovec+0x17/0x20
[ 73.960277] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960280] ? copy_msghdr_from_user+0xd9/0x150
[ 73.960286] ___sys_sendmsg+0x84/0xd0
[ 73.960292] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960294] ? __lock_acquire+0x3ce/0x1f20
[ 73.960299] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960304] ? find_held_lock+0x2b/0x80
[ 73.960307] ? __fget_files+0xc3/0x190
[ 73.960312] ? __fget_files+0xc3/0x190
[ 73.960315] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960317] ? lock_release+0x169/0x2b0
[ 73.960322] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960327] __sys_sendmsg+0x8e/0xd0
[ 73.960337] do_syscall_64+0x93/0x180
[ 73.960340] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960345] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960349] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960352] ? find_held_lock+0x2b/0x80
[ 73.960356] ? __memcg_slab_free_hook+0x56/0x230
[ 73.960360] ? __memcg_slab_free_hook+0x56/0x230
[ 73.960363] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960366] ? lock_release+0x169/0x2b0
[ 73.960370] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960373] ? __memcg_slab_free_hook+0x60/0x230
[ 73.960378] ? kmem_cache_free+0x144/0x430
[ 73.960382] ? syscall_exit_to_user_mode+0x11/0x280
[ 73.960387] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960390] ? do_syscall_64+0x9f/0x180
[ 73.960393] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960396] ? lockdep_hardirqs_on+0x78/0x100
[ 73.960400] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960403] ? do_syscall_64+0x9f/0x180
[ 73.960406] ? fd_install+0xba/0x310
[ 73.960411] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960414] ? kmem_cache_free+0x144/0x430
[ 73.960420] ? do_sys_openat2+0x64/0xb0
[ 73.960423] ? syscall_exit_to_user_mode+0x11/0x280
[ 73.960428] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960431] ? do_syscall_64+0x9f/0x180
[ 73.960435] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960437] ? lockdep_hardirqs_on+0x78/0x100
[ 73.960441] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960444] ? do_syscall_64+0x9f/0x180
[ 73.960448] ? srso_alias_return_thunk+0x5/0xfbef5
[ 73.960453] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 73.960457] RIP: 0033:0x7f857a9db79b
[ 73.960461] Code: 48 89 e5 48 83 ec 20 89 55 ec 48 89 75 f0 89 7d f8 e8 49 7a f7 ff 8b 55 ec 48 8b 75 f0 41 89 c0 8b 7d f8 b8 2e 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 2d 44 89 c7 48 89 45 f8 e8 a1 7a f7 ff 48 8b
[ 73.960464] RSP: 002b:00007ffcf5d3a660 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[ 73.960468] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f857a9db79b
[ 73.960471] RDX: 0000000000000000 RSI: 00007ffcf5d3a6a0 RDI: 000000000000000d
[ 73.960473] RBP: 00007ffcf5d3a680 R08: 0000000000000000 R09: 0000000000000000
[ 73.960476] R10: 0000000000000000 R11: 0000000000000293 R12: 00005632de03d9d0
[ 73.960478] R13: 000000000000000a R14: 00007ffcf5d3a83c R15: 0000000000000000
[ 73.960488] </TASK>
[ 73.960561] Generic FE-GE Realtek PHY r8169-0-2a00:00: attached PHY driver (mii_bus:phy_addr=r8169-0-2a00:00, irq=MAC)
> On 30.05.24 16:04, Russell King (Oracle) wrote:
>> On Thu, May 30, 2024 at 09:36:45AM -0400, Genes Lists wrote:
>>> On Thu, 2024-05-30 at 08:53 -0400, Genes Lists wrote:
>>> This report for 6.9.1 could well be the same issue:
>>> https://lore.kernel.org/lkml/e441605c-eaf2-4c2d-872b-d8e541f4cf60@xxxxxxxxx/
>>
>> The reg_check_chans_work() thing in pid 285 is likely stuck on the
>> rtnl lock. The same is true of pid 287.
>>
>> That will be because of the thread (pid 663) that's stuck in
>> __dev_open()...led_trigger_register(), where the rtnl lock will have
>> been taken in that path. It looks to me like led_trigger_register()
>> is stuck waiting for read access with the leds_list_lock rwsem.
>>
>> There are only two places that take that rwsem in write mode, which
>> are led_classdev_register_ext() and led_classdev_unregister(). None
>> of these paths are blocking in v6.9.
>>
>> Pid 641 doesn't look significant (its probably waiting for either
>> pid 285 or 287 to complete its work.)
>>
>> Pid 666 looks like it is blocked waiting for exclusive write-access
>> on the leds_list_lock - but it isn't holding that lock. This means
>> there must already be some other reader or writer holding this lock.
>>
>> Pid 722 doesn't look sigificant (same as pid 641).
>>
>> Pid 760 is also waiting for the rtnl lock.
>>
>> Pid 854, 855 also doesn't look sigificant (as pid 641).
>>
>> And then we get to pid 858. This is in set_device_name(), which
>> was called from led_trigger_set() and led_trigger_register().
>> We know from pid 663 that led_trigger_register() can take a read
>> on leds_list_lock, and indeed it does and then calls
>> led_match_default_trigger(), which then goes on to call
>> led_trigger_set(). Bingo, this is why pid 666 is blocked, which
>> then blocks pid 663. pid 663 takes the rtnl lock, which blocks
>> everything else _and_ also blocks pid 858 in set_device_name().
>>
>> Lockdep would've found this... this is a classic AB-BA deadlock
>> between the leds_list_lock rwsem and the rtnl mutex.
>>
>> I haven't checked to see how that deadlock got introduced, that's
>> for someone else to do.
>
> P.S.:
>
> #regzbot report: /
> #regzbot introduced: f5c31bcf604d
> #regzbot duplicate:
> https://lore.kernel.org/lkml/e441605c-eaf2-4c2d-872b-d8e541f4cf60@xxxxxxxxx/
> #regzbot summary: leds: Hung tasks due to a AB-BA deadlock between the
> leds_list_lock rwsem and the rtnl mutex
>