Re: [PATCHv2 net] usbnet: fix cyclical race on disconnect with work queue

From: Jakub Kicinski
Date: Tue Sep 10 2024 - 18:44:23 EST


On Thu, 5 Sep 2024 15:46:50 +0200 Oliver Neukum wrote:
> +static inline bool usbnet_going_away(struct usbnet *ubn)
> +{
> +	smp_mb__before_atomic(); /* against usbnet_mark_going_away() */
> +	return test_bit(EVENT_UNPLUG, &ubn->flags);
> +}
> +
> +static inline void usbnet_mark_going_away(struct usbnet *ubn)
> +{
> +	set_bit(EVENT_UNPLUG, &ubn->flags);
> +	smp_mb__after_atomic(); /* against usbnet_going_away() */
> +}

I have sort of an inverse question to what Paolo asked :)
AFAIU we need the double-cancel because checking the flag and
scheduling are not atomic. But if we do that, why the memory
barriers? They make it seem like we're doing something clever
with memory ordering, while really we're just depending on normal
properties of the tasklet/timer/work APIs.
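
To spell out the interleaving I have in mind, here is a minimal
single-threaded userspace sketch. It is a hypothetical model, not the
driver code: every model_* name is made up for illustration, and the
steps are replayed in one fixed order rather than run concurrently.

```c
#include <stdbool.h>

/* Hypothetical model of the device state; not the real struct usbnet. */
struct model_dev {
	bool going_away;   /* stands in for the EVENT_UNPLUG bit */
	bool work_pending; /* stands in for a queued work item */
};

/* First half of a timer callback: sample the flag. */
static bool model_timer_check(const struct model_dev *d)
{
	return !d->going_away;
}

/* Second half: queue the work if the earlier check passed. */
static void model_timer_queue(struct model_dev *d, bool may_queue)
{
	if (may_queue)
		d->work_pending = true;
}

/* Stands in for a synchronous cancel of the work item. */
static void model_cancel_work(struct model_dev *d)
{
	d->work_pending = false;
}

/*
 * Replay one unlucky interleaving and report whether work is still
 * pending once teardown has finished.
 */
static bool model_teardown(bool second_cancel)
{
	struct model_dev d = { .going_away = false, .work_pending = false };
	bool may_queue;

	may_queue = model_timer_check(&d); /* callback sees the flag clear */
	d.going_away = true;               /* disconnect marks the device */
	model_cancel_work(&d);             /* first cancel: nothing queued yet */
	model_timer_queue(&d, may_queue);  /* callback finishes, queues work */
	if (second_cancel)
		model_cancel_work(&d);     /* second cancel catches it */

	return d.work_pending;
}
```

In this model, model_teardown(false) returns true (the work item
survives teardown) and model_teardown(true) returns false. The second
cancel closes the gap; no barrier enters into it.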

FTR disable_work_sync() would work nicely here but it'd be
a PITA for backports.

Also - is this based on some report or syzbot? I'm a bit tempted
to put this in net-next given how unlikely the race is vs how
commonly used the driver is.
--
pw-bot: cr