Re: 5.5-rc1 oops on boot in 802.11 kernel driver
From: Toke Høiland-Jørgensen
Date: Mon Dec 09 2019 - 06:11:13 EST
Kalle Valo <kvalo@xxxxxxxxxxxxxx> writes:
> Hi Steve,
>
> Steve French <smfrench@xxxxxxxxx> writes:
>
>> Noticed this crash in the Linux kernel Wifi driver on boot a few
>> minutes ago immediately after updating to latest mainline kernel about
>> an hour ago. I didn't see it last week and certainly not in 5.4.
>
> please CC linux-wireless on all wireless related problems, we don't
> follow lkml very closely and I found your email just by chance.
>
> Full warning below. Steve is using iwlwifi.

Right, we already got a similar report off-list, but with a different
stack trace. I was going to try to reproduce this on my own machine
today. However, the fact that this trace includes the
iwl_mvm_tx_reclaim() function may be a hint; that code seems to be
reusing skbs without freeing them?

If I'm reading the code correctly, it seems the reuse leads to the same
skb being passed to ieee80211_tx_status() multiple times; the driver is
clearing info->status, but since we added the info->tx_time_est field,
that would lead to double-accounting of that skb, which would explain
the warning?
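
To make the suspected sequence concrete, here is a toy user-space model
of the accounting. It is only a sketch: none of this is mac80211 or
iwlwifi code, and every name in it (model_tx_info, model_queue,
model_run and so on) is made up for illustration. It assumes the
estimate is added to a pending counter at dequeue and subtracted again
on each status report, and that a negative counter is what trips the
warning.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the per-frame TX info (not the real struct). */
struct model_tx_info {
	uint16_t tx_time_est;	/* airtime estimate charged at dequeue */
	uint32_t status;	/* status area the driver already clears */
};

/* Hypothetical per-queue accounting state. */
struct model_queue {
	int64_t pending_airtime;
};

/* Charge the estimate when the frame is handed to the driver. */
static void model_dequeue(struct model_queue *q, struct model_tx_info *info,
			  uint16_t est)
{
	assert(q && info);
	info->tx_time_est = est;
	q->pending_airtime += est;
}

/* Status report: give the estimate back; return 1 if we went negative. */
static int model_tx_status(struct model_queue *q,
			   const struct model_tx_info *info)
{
	q->pending_airtime -= info->tx_time_est;
	return q->pending_airtime < 0;
}

/*
 * One frame with a 100-unit estimate is reported 'prior_reports' times,
 * then goes through a reclaim-like step that clears the status area,
 * optionally clears the estimate, and reports the frame once more.
 * Returns 1 if the counter ever went negative; the final counter value
 * is stored in *pending_out.
 */
static int model_run(int prior_reports, int clear_estimate,
		     int64_t *pending_out)
{
	struct model_queue q = { 0 };
	struct model_tx_info info = { 0 };
	int warned = 0;
	int i;

	model_dequeue(&q, &info, 100);
	for (i = 0; i < prior_reports; i++)
		warned |= model_tx_status(&q, &info);

	memset(&info.status, 0, sizeof(info.status));
	if (clear_estimate)
		info.tx_time_est = 0;
	warned |= model_tx_status(&q, &info);

	*pending_out = q.pending_airtime;
	return warned;
}
```

In this model, model_run(1, 0, &p) goes negative (the frame is
subtracted twice), model_run(1, 1, &p) ends at zero, and
model_run(0, 1, &p) leaves the full estimate stuck in the counter, which
is the opposite failure where the frame is never un-accounted.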

Can someone familiar with iwlwifi confirm that this is indeed what that
code is supposed to be doing? If it is, I think it needs the patch
below; however, if I'm wrong, then clearing the field could lead to the
opposite problem (that skbs fail to be un-accounted at all), which would
lead to the queue being throttled because the pending airtime gets too
high and is never brought back down...

-Toke

diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
index dc5c02fbc65a..7d822445730c 100644
--- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c
@@ -1848,6 +1848,7 @@ static void iwl_mvm_tx_reclaim(struct iwl_mvm *mvm, int sta_id, int tid,
 		iwl_trans_free_tx_cmd(mvm->trans, info->driver_data[1]);
 
 		memset(&info->status, 0, sizeof(info->status));
+		info->tx_time_est = 0;
 		/* Packet was transmitted successfully, failures come as single
 		 * frames because before failing a frame the firmware transmits
 		 * it without aggregation at least once.