Re: [PATCH] drivers: staging: r8188eu: Fix sleep-in-atomic-context bug in rtw_join_timeout_handler

From: duoming
Date: Fri Oct 21 2022 - 02:32:39 EST


Hello,

On Thu, 20 Oct 2022 17:46:47 +0200 Greg KH wrote:

> On Tue, Oct 18, 2022 at 04:34:24PM +0800, Duoming Zhou wrote:
> > The rtw_join_timeout_handler() is a timer handler that
> > runs in atomic context, but it could call msleep().
> > As a result, the sleep-in-atomic-context bug will happen.
> > The process is shown below:
> >
> > (atomic context)
> > rtw_join_timeout_handler
>
> Wait, how is this an atomic timeout?

Because this function is defined as a timer handler of "assoc_timer".

The following is the detail:

void rtw_init_mlme_timer(struct adapter *padapter)
{
struct mlme_priv *pmlmepriv = &padapter->mlmepriv;

timer_setup(&pmlmepriv->assoc_timer, rtw_join_timeout_handler, 0);
...
}

https://elixir.bootlin.com/linux/latest/source/drivers/staging/r8188eu/os_dep/mlme_linux.c#L36

> When can that happen?

When the adapter trys to join and selects scanning queue successfully,
the assoc_timer will be actived. If this process is timeout, the callback
function rtw_join_timeout_handler will run.

> > _rtw_join_timeout_handler
> > rtw_do_join
> > rtw_select_and_join_from_scanned_queue
> > rtw_indicate_disconnect
> > rtw_lps_ctrl_wk_cmd
> > lps_ctrl_wk_hdl
> > LPS_Leave
> > LPS_RF_ON_check
> > msleep //sleep in atomic context
>
> How was this found?
>
> > Fix by removing msleep() and replacing with mdelay().
>
> Wouldn't people have seen an error already if msleep() was really called
> in atomic context?

I am sorry, This is the false alarm.

rtw_indicate_disconnect()
-->rtw_lps_ctrl_wk_cmd(padapter, LPS_CTRL_DISCONNECT, 1);

u8 rtw_lps_ctrl_wk_cmd(struct adapter *padapter, u8 lps_ctrl_type, u8 enqueue)
{
...
if (enqueue) {
...
}else {
lps_ctrl_wk_hdl(padapter, lps_ctrl_type);
}

The enqueue equals to 1 and the lps_ctrl_wk_hdl() will not execute.

I will check carefully and avoid false alarm next time. Thank you for your reply.

Best regards,
Duoming Zhou