Re: [PATCH] wifi: mwifiex: Fix two buggy list traversals
From: Brian Norris
Date: Wed Jul 31 2024 - 16:10:13 EST
On Tue, Jul 30, 2024 at 11:05:30AM -0700, Calvin Owens wrote:
> Both of these list traversals use list_for_each_entry_safe(), yet drop
> the lock protecting the list during the traversal.
>
> Because the _safe() iterator stores a pointer to the next list node
> locally so the current node can be deleted, dropping the lock this way
> means the next "cached" list_head might be freed by another caller,
> leading the iterator to dereference pointers in freed memory after
> reacquiring the lock.
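To make the hazard described above concrete, here is a minimal userspace sketch. It is not driver code: the struct, the helpers, and walk_with_unlock_window() are hypothetical stand-ins, and "freeing" is modelled by a flag so the program never touches released memory. It walks a list the way list_for_each_entry_safe() does, caching the next node before the body runs, and counts how often the walk lands on a node that was deleted while the lock would have been dropped.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a list entry; not the mwifiex structure. */
struct node {
	struct node *prev, *next;
	int id;
	bool freed;	/* models kfree(); memory is never really released */
};

static void list_init(struct node *head)
{
	head->prev = head->next = head;
}

static void list_add_tail(struct node *n, struct node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/* Unlink a node and mark it freed; its own pointers are left untouched. */
static void list_del_and_free(struct node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->freed = true;
}

/*
 * Iterate like list_for_each_entry_safe(): 'next' is cached before the
 * loop body runs. In the first iteration's "unlock window" another
 * context deletes 'victim'. Returns how many freed nodes were visited.
 */
static int walk_with_unlock_window(struct node *head, struct node *victim)
{
	struct node *cur, *next;
	int stale_visits = 0;

	for (cur = head->next, next = cur->next; cur != head;
	     cur = next, next = cur->next) {
		if (cur->freed)
			stale_visits++;	/* a real kernel would be reading freed memory */

		/* Lock dropped here: someone else may delete any node. */
		if (victim && !victim->freed)
			list_del_and_free(victim);
		/* Lock retaken: the cached 'next' may now be stale. */
	}
	return stale_visits;
}
```

On a three-node list, deleting the current node in the window is harmless (that is the case _safe() exists for), while deleting the cached next node makes the walk visit it anyway.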
There are lots of unclear and/or unsound locking patterns in this
driver. You've probably identified one, although I don't think you've
solved 100% of it.

Here's another: is it valid for mwifiex_11n_rx_reorder_pkt() ->
mwifiex_11n_get_rx_reorder_tbl() to retrieve a 'tbl' pointer (without
removing it from the list), and then continue to operate on that without
holding any locks? (I think the answer is "no".)

Side note: you might also refer to this old thread:

https://lore.kernel.org/all/CAD=FV=VuxFtDdcMndLNzVYDoid8N3jP46j0sOFXG1D4CzX0=Zw@xxxxxxxxxxxxxx/

I don't think Marvell ever fully resolved all the issues there.
> Fix by moving to-be-deleted objects to an on-stack list before actually
> deleting them, so the lock can be held for the entire traversal.
>
> This is a bit ugly, because mwifiex_del_rx_reorder_entry() will still
> take the rx_reorder_tbl_lock to delete the item from the two on-stack
> lists introduced in this patch. But that is just ugly, not wrong, and
> the function has other callers... making the locking conditional seems
> strictly uglier.
I noticed this "ugliness", but I agree with your reasoning -- it's as
good as we can do here for now.
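For reference, the general shape of the approach described above can be sketched in userspace as follows. This is an illustration under assumptions, not the patch itself: struct entry, tbl_lock()/tbl_unlock() (a flag standing in for rx_reorder_tbl_lock), add_entry(), and remove_matching() are all hypothetical names. Unlike the patch, where mwifiex_del_rx_reorder_entry() still takes the lock while emptying the on-stack list, this sketch frees the private list without any locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical table entry; not the real mwifiex structure. */
struct entry {
	struct entry *prev, *next;
	int tid;
};

/* Stand-in for the table spinlock; only tracks whether it is held. */
static bool tbl_lock_held;

static void tbl_lock(void)   { assert(!tbl_lock_held); tbl_lock_held = true; }
static void tbl_unlock(void) { assert(tbl_lock_held);  tbl_lock_held = false; }

static void list_init(struct entry *head)
{
	head->prev = head->next = head;
}

static void list_del(struct entry *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

static void list_add_tail(struct entry *e, struct entry *head)
{
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Allocate an entry and append it to the table; returns NULL on failure. */
static struct entry *add_entry(struct entry *tbl, int tid)
{
	struct entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->tid = tid;
	list_add_tail(e, tbl);
	return e;
}

/*
 * Remove every entry with a matching tid. The lock is held for the whole
 * traversal; doomed entries are only moved to an on-stack list, and the
 * actual freeing happens after the lock is released.
 */
static int remove_matching(struct entry *tbl, int tid)
{
	struct entry doomed, *cur, *next;
	int freed = 0;

	list_init(&doomed);

	tbl_lock();
	for (cur = tbl->next; cur != tbl; cur = next) {
		next = cur->next;	/* stays valid: the lock is never dropped */
		if (cur->tid == tid) {
			list_del(cur);
			list_add_tail(cur, &doomed);
		}
	}
	tbl_unlock();

	/* 'doomed' is private to this stack frame from here on. */
	while (doomed.next != &doomed) {
		cur = doomed.next;
		list_del(cur);
		free(cur);
		freed++;
	}
	return freed;
}
```

The point of the design is that the cached next pointer is only ever used while the lock is continuously held, so nothing can free it underneath the iterator.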
> I discovered this bug while studying the new "nxpwifi" driver, which was
> sent to the mailing list about a month ago:
>
> https://lore.kernel.org/lkml/20240621075208.513497-1-yu-hao.lin@xxxxxxx/
>
> ...but it turns out the new 11n_rxreorder.c in nxpwifi is essentially
> exactly identical to mwifiex, save for s/mwifiex/nxpwifi/, so I wanted
> to pass along a bugfix for the original driver as well.
That's another can of worms. mwifiex is horrible, and so if you were
asking me, I'd reject any attempt at copy/paste/modify that doesn't make
significant efforts to refactor and improve -- for instance, better
documentation about what all the locks mean, and clarity such that
readers can be confident that the code is doing the right thing. For
example, I think this mwifiex comment is a lie:

	/* spin lock for rx_reorder_tbl_ptr queue */
	spinlock_t rx_reorder_tbl_lock;

I believe it's supposed to protect the elements within the list too --
but it doesn't do a good job of that.

But that's a side track...
> I only have an IW612, so this patch was only tested on "nxpwifi".
I don't think we can accept an untested patch here. If you're lucky,
maybe I or someone else on CC can test for you though.
> Signed-off-by: Calvin Owens <calvin@xxxxxxxxxx>
> ---
> .../wireless/marvell/mwifiex/11n_rxreorder.c | 26 +++++++++----------
> 1 file changed, 12 insertions(+), 14 deletions(-)
I think the patch looks good enough, but I won't ack it without testing.
And while you're at it, I'd recommend some further auditing, per the
above.

Brian