Re: [PATCH v3] workqueue: don't skip lockdep work dependency in cancel_work_sync()

From: Johannes Berg
Date: Wed Sep 13 2023 - 10:42:26 EST


Hi Guenter,

> This patch results in the attached lockdep splat when running the
> ast2600-evb emulation in qemu with aspeed_g5_defconfig and lock debugging
> enabled. Reverting this patch fixes the problem.

Umm ... That's only true if you think the problem is the lockdep splat,
rather than the actual potential deadlock?!

> [ 9.809960] ======================================================
> [ 9.810053] WARNING: possible circular locking dependency detected
> [ 9.810196] 6.6.0-rc1-00004-g6faca50f629f #1 Tainted: G N

I don't have this exact tree, but on 6.6-rc1,

> [ 9.810327] ------------------------------------------------------
> [ 9.810406] ip/357 is trying to acquire lock:
> [ 9.810501] 83af6c40 ((work_completion)(&(&dev->state_queue)->work)){+.+.}-{0:0}, at: __flush_work+0x40/0x550
> [ 9.811052]
> [ 9.811052] but task is already holding lock:
> [ 9.811133] 81639924 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x124/0x514
> [ 9.811264]
> [ 9.811264] which lock already depends on the new lock.
> [ 9.811264]
> [ 9.811361]
> [ 9.811361] the existing dependency chain (in reverse order) is:
> [ 9.811466]
> [ 9.811466] -> #1 (rtnl_mutex){+.+.}-{3:3}:
> [ 9.811616] lock_acquire+0xfc/0x368
> [ 9.811717] __mutex_lock+0x90/0xf00
> [ 9.811782] mutex_lock_nested+0x24/0x2c
> [ 9.811845] ftgmac100_reset+0x1c/0x1dc


This does indeed take the RTNL:

static void ftgmac100_reset(struct ftgmac100 *priv)
{
struct net_device *netdev = priv->netdev;
int err;

netdev_dbg(netdev, "Resetting NIC...\n");

/* Lock the world */
rtnl_lock();

and is called from

> [ 9.811907] ftgmac100_adjust_link+0xc0/0x13c
> [ 9.811972] phy_link_change+0x30/0x5c
> [ 9.812035] phy_check_link_status+0x9c/0x11c
> [ 9.812100] phy_state_machine+0x1c0/0x2c0

this work (phy_state_machine is the function), which

> [ 9.812405] -> #0 ((work_completion)(&(&dev->state_queue)->work)){+.+.}-{0:0}:
> [ 9.812531] check_prev_add+0x128/0x15ec
> [ 9.812594] __lock_acquire+0x16ec/0x20cc
> [ 9.812656] lock_acquire+0xfc/0x368
> [ 9.812712] __flush_work+0x70/0x550
> [ 9.812769] __cancel_work_timer+0x1e4/0x264
> [ 9.812833] phy_stop+0x78/0x128

is cancelled by phy_stop() in phy_stop_machine():

void phy_stop_machine(struct phy_device *phydev)
{
cancel_delayed_work_sync(&phydev->state_queue);

but of course that's called by the driver under RTNL:

> [ 9.812889] ftgmac100_stop+0x5c/0xac
> [ 9.812949] __dev_close_many+0xb8/0x140

(__dev_close_many requires RTNL)


So you have a potential deadlock in this driver. Yes, workqueue items
and RTNL are basically incompatible. Don't do that. Now this bug was
_probably_ added by commit 1baf2e50e48f ("drivers/net/ftgmac100: fix
DHCP potential failure with systemd") which added a call to
ftgmac100_reset() in ftgmac100_adjust_link() which is the thing called
from the PHY state machine in the first place.

Should that be reverted? I don't know ... maybe it can be fixed
differently.


But anyway ... as far as lockdep/workqueue stuff is concerned I'd
definitely call it a win rather than a bug! Yay for making lockdep
useful - it found a deadlock situation for you! :-) No need to blame
lockdep for that :P

johannes