Re: [PATCH] mac80211: fix race in ieee80211_register_hw()

From: Sumit Garg
Date: Mon Apr 06 2020 - 09:00:17 EST


On Mon, 6 Apr 2020 at 18:14, Johannes Berg <johannes@xxxxxxxxxxxxxxxx> wrote:
>
> On Mon, 2020-04-06 at 17:51 +0530, Sumit Garg wrote:
> > A race condition leading to a kernel crash is observed during invocation
> > of ieee80211_register_hw() on a dragonboard410c device having wcn36xx
> > driver built as a loadable module along with a wifi manager in user-space
> > waiting for a wifi device (wlanX) to be active.
> >
> > Sequence diagram for a particular kernel crash scenario:
> >
> > user-space ieee80211_register_hw() RX IRQ
> > +++++++++++++++++++++++++++++++++++++++++++++
> > | | |
> > |<---wlan0---wiphy_register() |
> > |----start wlan0---->| |
> > | |<---IRQ---(RX packet)
> > | Kernel crash |
> > | due to unallocated |
> > | workqueue. |
> > | | |
> > | alloc_ordered_workqueue() |
> > | | |
> > | Misc wiphy init. |
> > | | |
> > | ieee80211_if_add() |
> > | | |
> >
> > As evident from above sequence diagram, this race condition isn't specific
> > to a particular wifi driver but rather the initialization sequence in
> > ieee80211_register_hw() needs to be fixed.
>
> Indeed, oops.
>
> > So re-order the initialization
> > sequence and the updated sequence diagram would look like:
> >
> > user-space ieee80211_register_hw() RX IRQ
> > +++++++++++++++++++++++++++++++++++++++++++++
> > | | |
> > | alloc_ordered_workqueue() |
> > | | |
> > | Misc wiphy init. |
> > | | |
> > |<---wlan0---wiphy_register() |
> > |----start wlan0---->| |
> > | |<---IRQ---(RX packet)
> > | | |
> > | ieee80211_if_add() |
> > | | |
>
> Makes sense.
>
> > @@ -1254,6 +1250,14 @@ int ieee80211_register_hw(struct ieee80211_hw *hw)
> > local->sband_allocated |= BIT(band);
> > }
> >
> > + rtnl_unlock();
> > +
> > + result = wiphy_register(local->hw.wiphy);
> > + if (result < 0)
> > + goto fail_wiphy_register;
> > +
> > + rtnl_lock();
>
> I'm a bit worried about this unlock/relock here though.
>
> I think we only need the RTNL for the call to
> ieee80211_init_rate_ctrl_alg() and then later ieee80211_if_add(), so
> perhaps we can move that a little closer?
>

Sure, will move rtnl_unlock() to just after call to
ieee80211_init_rate_ctrl_alg().

> All the stuff between is really just setting up local stuff, so doesn't
> really need to worry?
>

Okay.

-Sumit

> johannes
>
>