Re: [Bug] mt7921e driver in 5.16 causes kernel panic

From: Khalid Aziz
Date: Tue Jan 11 2022 - 21:16:32 EST


On 1/11/22 17:49, sean.wang@xxxxxxxxxxxx wrote:
From: Sean Wang <sean.wang@xxxxxxxxxxxx>

On 1/11/22 16:31, Ben Greear wrote:
On 1/11/22 3:17 PM, Khalid Aziz wrote:
I am seeing an intermittent bug in the mt7921e driver. When the driver
module is loaded and is being initialized, almost every other time it
seems to write to some wild memory location. This results in the driver
failing to initialize with the message "Timeout for driver own", and at
the same time I start to see "Bad page state" messages for random
processes. Here is the relevant part of dmesg:

Please see if this helps.

From: Ben Greear <greearb@xxxxxxxxxxxxxxx>

If the nic fails to start, it is possible that the reset_work has
already been scheduled. Ensure the work item is canceled so we do not
have use-after-free crash in case cleanup is called before the work
item is executed.

This fixes crash on my x86_64 apu2 when mt7921k radio fails to work.
Radio still fails, but OS does not crash.

Signed-off-by: Ben Greear <greearb@xxxxxxxxxxxxxxx>
---
drivers/net/wireless/mediatek/mt76/mt7921/main.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c
index 6073bedaa1c08..9b33002dcba4a 100644
--- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c
+++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c
@@ -272,6 +272,7 @@ static void mt7921_stop(struct ieee80211_hw *hw)

 	cancel_delayed_work_sync(&dev->pm.ps_work);
 	cancel_work_sync(&dev->pm.wake_work);
+	cancel_work_sync(&dev->reset_work);
 	mt76_connac_free_pending_tx_skbs(&dev->pm, NULL);

 	mt7921_mutex_acquire(dev);
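
The ordering the patch above relies on (cancel pending work before the
memory it refers to goes away) can be sketched in user space. This is only
an illustration under stated assumptions: the one-slot queue and the names
queue_work_item(), cancel_work_item(), run_queue(), dev_alloc() and
dev_stop_and_free() are hypothetical stand-ins, not kernel APIs, and unlike
the real cancel_work_sync() the sketch does not wait for a handler that is
already running.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct work {
    void (*fn)(struct work *w);
};

struct dev {
    struct work reset_work;
    int resets;             /* number of times the reset handler ran */
};

/* A one-slot "workqueue": holds at most one scheduled work item. */
static struct work *queued;

static void queue_work_item(struct work *w)
{
    queued = w;
}

/* Remove w from the queue if it is still pending; true if it was pending. */
static bool cancel_work_item(struct work *w)
{
    if (queued != w)
        return false;
    queued = NULL;
    return true;
}

/* Run the pending item, if any; returns how many items were executed. */
static int run_queue(void)
{
    struct work *w = queued;

    if (w == NULL)
        return 0;
    queued = NULL;
    w->fn(w);               /* would touch freed memory if w's owner is gone */
    return 1;
}

static void reset_fn(struct work *w)
{
    /* Recover the owning device from the embedded work item. */
    struct dev *d = (struct dev *)((char *)w - offsetof(struct dev, reset_work));

    d->resets++;
}

static struct dev *dev_alloc(void)
{
    struct dev *d = calloc(1, sizeof(*d));

    assert(d != NULL);
    d->reset_work.fn = reset_fn;
    return d;
}

/* Teardown: cancel pending work before freeing the memory it points into. */
static void dev_stop_and_free(struct dev *d)
{
    cancel_work_item(&d->reset_work);
    free(d);
}
```

If the cancel call in dev_stop_and_free() were dropped, a later run_queue()
would call reset_fn() on freed memory, which is the use-after-free the
commit message describes.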

Hi Ben,

Unfortunately that did not help. I still saw the same messages and a kernel panic. I do not see this bug if I power down the laptop before booting it up, so mt7921_stop() does seem like a reasonable place to fix it.

Hi, Khalid

Could you try the patch below? It should help with your issue.

https://patchwork.kernel.org/project/linux-wireless/patch/70e27cbc652cbdb78277b9c691a3a5ba02653afb.1641540175.git.objelf@xxxxxxxxx/

Hi Sean,

That worked! I tried 5 reboots back-to-back after applying your patch without powering down my laptop. There were no error messages, the kernel came up every time, and wifi worked.

Thanks,
Khalid