Re: [PATCH v3 0/8] rtw88: prepare locking for SDIO support

From: Johannes Berg
Date: Thu Feb 03 2022 - 17:26:38 EST


Hi,

Sorry, I kind of saw this fly by, read over it and wasn't sure I should
take a closer look, and then promptly forgot about it ...

> > So, I think the easiest way is to maintains the vif/sta lists in driver when
> > ::{add,remove }_interface/::sta_{add,remove}, and hold rtwdev->mutex lock to
> > access these lists. But, Johannes pointed out this is not a good idea [1].

> Thank you for this detailed explanation! I appreciate that you took
> the time to clearly explain this.
>
> For the sta use-case I thought about adding a dedicated rwlock
> (include/linux/rwlock.h) for rtw_dev->mac_id_map.

You can't sleep under an rwlock either though? Wasn't that the point?

> rtw_sta_{add,remove} would take a write-lock.
> rtw_iterate_stas() takes the read-lock (the lock would be acquired
> before calling into ieee80211_iterate_...). Additionally
> rtw_iterate_stas() needs to check if the station is still valid
> according to mac_id_map - if not: skip/ignore it for that iteration.

All of that "still valid" seems fragile though, IMHO. Hard to reason
that it's not racy or prone to use-after-free situations.

IIUC, you're trying to iterate interfaces and stations in a sleepable
context, but you're worried about deadlocks with locking in mac80211, if
ieee80211_iterate_interfaces() or ieee80211_iterate_stations() take
locks that we already hold due to coming into the code from mac80211? Or
about ABBA deadlocks arising from this?

IMHO you should still do it, and just be careful about the locking. I'd
be happy to add APIs for e.g. ieee80211_iterate_stations_locked() when
you know you already hold local->sta_mtx, though whether or not that can
even happen today in any callbacks I don't know, haven't audited it, but
it shouldn't be that hard to audit the path(s) you want to create?
Unless you need this in some very frequently called function ...


Now all of this is potentially also (and just as) error prone as doing
your own iteration machinery, however, my longer-term plan is to unify
the locking more. Now that we're no longer relying on the RTNL so much,
I'm planning to see if we couldn't get rid of most locks in mac80211.
The thing is, that most of the time we're doing all this fine-grained
locking ... under the wdev->mtx, so it's entirely pointless, and in fact
just a bunch of extra work.

So now that from cfg80211 perspective we have everything running under
wdev lock, I think we could in mac80211 just remove all the locks and
take the wdev lock in all the async workers. We already do in many I
guess, or we take local->mtx which is kind of equivalent anyway.

The question is then if we keep wdev->mtx at all, and I'm not really
sure about that yet. There are arguments both ways, and maybe that means
we'd move some things under there rather than under the overall mutex.
One of the strongest arguments for this is probably that mac80211
doesn't really do anything expensive (there aren't really any expensive
calculations in it), so most time spent is in drivers ... and guess
what, drivers mostly just have their own global lock they take for
everything, which makes sense since they need to talk to the device or
firmware.

However we answer that question though, and I'm trending towards
removing the wdev/sdata locks, I (now) think all the mutexes that we
have at the 'local' level in mac80211 are fairly much pointless.

In essence then, with some work in the stack, we could basically have
all calls (apart from TX) into the drivers done with the wiphy->mtx
held, at which point drivers don't even really need their own lock (they
can acquire wiphy_lock() too in workers). Then this whole thing really
all collapses down to being able to do the iteration, just having to
ensure you wiphy_lock() your calls that aren't coming from mac80211 in
the first place.

I really think that's the medium-term strategy, and any help would be
welcome. I'm not sure I can dedicate time to this soon, and the RTNL
rework took I think over a year after I started thinking about it, so I
guess we'll see...

We can do this incrementally btw, like remove local->mtx now and use
wiphy_lock() in its place, then remove local->sta_mtx, local->iface_mtx,
one by one. The RTNL rework was one of the major stepping stones to this
I think, because it ensured a consistent handling of the wiphy_lock() in
cfg80211.

I realize this doesn't help you in the short term, but if you do a lot
of complicated things now, it'll also be hard to get rid of. If you use
the normal iteration functions now it'll be harder to reason about now,
but will sort of automatically become easier once the lock redux work I
outlined above starts. And like I said, any help welcome :)

johannes