Re: [PATCH 05/14] mwifiex: re-register wiphy across reset

From: Johannes Berg
Date: Thu Jun 22 2017 - 09:02:59 EST


On Wed, 2017-06-21 at 11:27 -0700, Brian Norris wrote:
>
> > I'm not sure what you mean by "we need to atually stop all the
> > virtual interfaces ([...]) first".
>
> Judging by your following comments, I may have been completely
> mistaken.
> (But that's why I asked you folks!)

:)

> > There are essentially only two/three ways to reach this - data
> > path,
> > which is getting stopped here, and control path (both nl80211 and
> > perhaps ndo ops like start/stop).
>
> I think I was conflating virtual interfaces with control path (e.g.,
> nl80211 scans, set freq, etc.). The idea is that control operations
> may still get *started* after the above, and it's just plain
> impossible to resolve the races with driver queue teardown if we're
> queueing up new control ops at the same time.

Agree.

> But even if we kill off the wireless_dev's, I suppose there are still
> control interfaces that can talk directly to the wiphy.

Yeah, only a few.

> > Without checking the code now, it seems entirely plausible that
> > this is
> > holding some lock that would lock out the control path entirely,
> > for
> > the duration until the wiphy is actually unregistered?
> >
> > Actually, you can't unregister with the relevant locks held
> > (without
> > causing deadlocks), so perhaps it's marking the wiphy as
> > unavailable so
> > that all operations fail?
>
> One of the above two sounds along the right line. But it's something
> I couldn't really figure out how to do quite right.
>
> Dumb question: how would I mark the wiphy as unavailable? Is there
> something I can do at the cfg80211 level? Or would I really have to
> guard all the cfg80211 entry points into mwifiex with a flag or lock?

There isn't really a good way to do this. You can, of course, call
wiphy_unregister(), but if you could do that you'd already have the
problem solved, I think?

I'm not really familiar enough with the context this happens in - can't
you let all the operations that try to talk to the firmware fail
(because the firmware is dead, or whatever) and then call
wiphy_unregister()?

> Also, IIUC, we need to wait for all control paths to complete (or
> cancel) before we can free up the associated resources; so just
> marking "unavailable" isn't enough.

Yeah, I suppose so. Though if you just do all the freeing after
wiphy_unregister() it'll do that for you?

johannes