Re: [PATCHv2 net] usbnet: fix cyclical race on disconnect with work queue

From: Paolo Abeni
Date: Tue Sep 10 2024 - 05:59:13 EST




On 9/5/24 15:46, Oliver Neukum wrote:
The work can submit URBs and the URBs can schedule the work.
This cycle needs to be broken, when a device is to be stopped.
Use a flag to do so.
This is a design issue as old as the driver.

Signed-off-by: Oliver Neukum <oneukum@xxxxxxxx>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
CC: stable@xxxxxxxxxxxxxxx
---

v2: fix PM reference issue

drivers/net/usb/usbnet.c | 37 ++++++++++++++++++++++++++++---------
include/linux/usb/usbnet.h | 17 +++++++++++++++++
2 files changed, 45 insertions(+), 9 deletions(-)

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index 18eb5ba436df..2506aa8c603e 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -464,10 +464,15 @@ static enum skb_state defer_bh(struct usbnet *dev, struct sk_buff *skb,
void usbnet_defer_kevent (struct usbnet *dev, int work)
{
set_bit (work, &dev->flags);
- if (!schedule_work (&dev->kevent))
- netdev_dbg(dev->net, "kevent %s may have been dropped\n", usbnet_event_names[work]);
- else
- netdev_dbg(dev->net, "kevent %s scheduled\n", usbnet_event_names[work]);
+ if (!usbnet_going_away(dev)) {
+ if (!schedule_work(&dev->kevent))
+ netdev_dbg(dev->net,
+ "kevent %s may have been dropped\n",
+ usbnet_event_names[work]);
+ else
+ netdev_dbg(dev->net,
+ "kevent %s scheduled\n", usbnet_event_names[work]);
+ }
}
EXPORT_SYMBOL_GPL(usbnet_defer_kevent);
@@ -535,7 +540,8 @@ static int rx_submit (struct usbnet *dev, struct urb *urb, gfp_t flags)
tasklet_schedule (&dev->bh);
break;
case 0:
- __usbnet_queue_skb(&dev->rxq, skb, rx_start);
+ if (!usbnet_going_away(dev))
+ __usbnet_queue_skb(&dev->rxq, skb, rx_start);
}
} else {
netif_dbg(dev, ifdown, dev->net, "rx: stopped\n");
@@ -843,9 +849,18 @@ int usbnet_stop (struct net_device *net)
/* deferred work (timer, softirq, task) must also stop */
dev->flags = 0;
- del_timer_sync (&dev->delay);
- tasklet_kill (&dev->bh);
+ del_timer_sync(&dev->delay);
+ tasklet_kill(&dev->bh);
cancel_work_sync(&dev->kevent);
+
+ /* We have cyclic dependencies. Those calls are needed
+ * to break a cycle. We cannot fall into the gaps because
+ * we have a flag
+ */
+ tasklet_kill(&dev->bh);
+ del_timer_sync(&dev->delay);
+ cancel_work_sync(&dev->kevent);

I guess you do the shutdown twice because a running tasklet or timer could re-schedule the others? If so, what prevent the rescheduling to happen in the 2nd iteration? why can't you add usbnet_going_away() checks on tasklet and timer reschedule point?

Thanks,

Paolo