Re: Hung tasks due to a AB-BA deadlock between the leds_list_lock rwsem and the rtnl mutex

From: Andrew Lunn
Date: Fri May 31 2024 - 08:55:00 EST


> I actually have been looking at a ledtrig-netdev lockdep warning yesterday
> which I believe is the same thing. I'll include the lockdep trace below.
>
> According to lockdep there indeed is an ABBA (ish) cyclic deadlock with
> the rtnl mutex vs led-triggers related locks. I believe that this problem
> may be a pre-existing problem but this now actually gets hit in kernels >=
> 6.9 because of commit 66601a29bb23 ("leds: class: If no default trigger is
> given, make hw_control trigger the default trigger"). Before that commit
> the "netdev" trigger would not be bound / set as phy LEDs trigger by default.
>
> +Cc Heiner Kallweit who authored that commit.
>
> The netdev trigger typically is not needed because the PHY LEDs are typically
> under hw-control and the netdev trigger even tries to leave things that way
> so setting it as the active trigger for the LED class device is basically
> a no-op. I guess the goal of that commit is to correctly have the triggers
> file content reflect that the LED is controlled by a netdev and to allow
> changing the hw-control mode without the user first needing to set netdev
> as trigger before being able to change the mode.

It was not the intention that this trigger is loaded on all
systems. It should only be loaded on those that actually have LEDs
which can be controlled:

drivers/net/ethernet/realtek/r8169_leds.c: led_cdev->hw_control_trigger = "netdev";
drivers/net/ethernet/realtek/r8169_leds.c: led_cdev->hw_control_trigger = "netdev";
drivers/net/ethernet/intel/igc/igc_leds.c: led_cdev->hw_control_trigger = "netdev";
drivers/net/dsa/qca/qca8k-leds.c: port_led->cdev.hw_control_trigger = "netdev";
drivers/net/phy/phy_device.c: cdev->hw_control_trigger = "netdev";
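
As a rough user-space sketch of the selection described by the commit
title (this is not the kernel code; struct led_model and both helpers
are hypothetical names made up for illustration), the fallback only
ends up at "netdev" for LEDs whose driver set hw_control_trigger:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the two led_classdev fields involved. */
struct led_model {
	const char *default_trigger;    /* explicitly requested trigger, or NULL */
	const char *hw_control_trigger; /* trigger usable for hw control, or NULL */
};

/*
 * Assumed rule, taken from the commit title: if no default trigger is
 * given, the hw_control trigger becomes the default. Returns NULL when
 * neither field is set, i.e. no trigger is selected by default.
 */
static const char *led_model_default_trigger(const struct led_model *led)
{
	if (led->default_trigger)
		return led->default_trigger;
	return led->hw_control_trigger;
}

/* Non-zero if this LED would get the "netdev" trigger by default. */
static int led_model_binds_netdev(const struct led_model *led)
{
	const char *name = led_model_default_trigger(led);

	return name != NULL && strcmp(name, "netdev") == 0;
}
```

Under that assumption, an LED with neither field set never selects
"netdev", and an LED with an explicit default trigger keeps it.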

Reverting this patch does seem like a good way forward, but I would
also like to give Heiner a little bit of time to see if he has a quick
real fix.

Andrew