Re: [PATCH] mwifiex: don't call del_timer_sync() on uninitialized timer

From: Brian Norris
Date: Tue Jul 28 2020 - 14:46:11 EST


Hi,

On Mon, Jul 27, 2020 at 6:45 PM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> syzbot is reporting that del_timer_sync() is called from
> mwifiex_usb_cleanup_tx_aggr() from mwifiex_unregister_dev() without
> checking timer_setup() from mwifiex_usb_tx_init() was called [1].
> Since mwifiex_usb_prepare_tx_aggr_skb() is calling del_timer() if
> is_hold_timer_set == true, use the same condition for del_timer_sync().
>
> [1] https://syzkaller.appspot.com/bug?id=fdeef9cf7348be8b8ab5b847f2ed993aba8ea7b6
>
> Reported-by: syzbot <syzbot+373e6719b49912399d21@xxxxxxxxxxxxxxxxxxxxxxxxx>
> Cc: Ganapathi Bhat <gbhat@xxxxxxxxxxx>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> ---
> A patch from Ganapathi Bhat ( https://patchwork.kernel.org/patch/10990275/ ) is stalling
> at https://lore.kernel.org/linux-usb/MN2PR18MB2637D7C742BC235FE38367F0A09C0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/ .
> syzbot by now got this report for 10000 times. Do we want to go with this simple patch?

Sorry, that stall is partly my fault, and partly Ganapathi's. It
doesn't help that it took him 4 months to reply to my questions, so I
completely lost even the tiny bit of context I had managed to build up
in my head at initial review time... and so it's still buried in the
dark corners of my inbox. (I think I'll go archive that now, because
it really deserves a better sell than it had initially, if Ganapathi
really wants to land it.)

> drivers/net/wireless/marvell/mwifiex/usb.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/wireless/marvell/mwifiex/usb.c b/drivers/net/wireless/marvell/mwifiex/usb.c
> index 6f3cfde..04a1461 100644
> --- a/drivers/net/wireless/marvell/mwifiex/usb.c
> +++ b/drivers/net/wireless/marvell/mwifiex/usb.c
> @@ -1353,7 +1353,8 @@ static void mwifiex_usb_cleanup_tx_aggr(struct mwifiex_adapter *adapter)
> skb_dequeue(&port->tx_aggr.aggr_list)))
> mwifiex_write_data_complete(adapter, skb_tmp,
> 0, -1);
> - del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
> + if (port->tx_aggr.timer_cnxt.is_hold_timer_set)

I believe if we ever actually started aggregation, then the timer can
be active at this point, and thus, the access to 'is_hold_timer_set'
is racy.

This *probably* deserves a better refactor, but in absence of that
(and a better explanation than Ganapathi gave), I think you at least
need to hold port->tx_aggr_lock. So perhaps (totally untested):

spin_lock_bh(&port->tx_aggr_lock);
if (port->tx_aggr.timer_cnxt.is_hold_timer_set) {
port->tx_aggr.timer_cnxt.is_hold_timer_set = false;
spin_unlock_bh(&port->tx_aggr_lock);
/* Timer could still be running, but it can't be restarted at this
point, so this is safe. */
del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
} else {
spin_unlock_bh(&port->tx_aggr_lock);
}

Otherwise, I think this is fine:

Reviewed-by: Brian Norris <briannorris@xxxxxxxxxxxx>

I also believe mwifiex_usb_prepare_tx_aggr_skb() needs to stop using
del_timer() (without the _sync()), because otherwise we might have
deactivated the timer already but not ensured that it has completely
finished executing on other CPUs. But that is probably orthogonal to
the current patch. (Again, so much in this driver needs refactoring.)

Side note: this entire TX aggregation feature for USB has been hidden
behind the mwifiex.aggr_ctrl module param since its introduction,
which has always been disabled by default. I wonder whether anybody is
*really* testing it, or whether it's 100% broken, as with many things
in this driver...


Brian

> + del_timer_sync(&port->tx_aggr.timer_cnxt.hold_timer);
> port->tx_aggr.timer_cnxt.is_hold_timer_set = false;
> port->tx_aggr.timer_cnxt.hold_tmo_msecs = 0;
> }
> --
> 1.8.3.1
>