Re: mwifiex firmware crash

From: Dave Olsthoorn
Date: Wed May 19 2021 - 15:20:29 EST


Hi,

I'll drop some of the people since this is a sub-thread of the original, but I'll keep the lists so this stays accessible via lore.kernel.org.

On 2021-05-15 18:53, Pali Rohár wrote:
Hello!

On Saturday 15 May 2021 18:32:30 Dave Olsthoorn wrote:
Hi,

On 2021-05-15 17:40, Pali Rohár wrote:
> On Saturday 15 May 2021 17:10:31 Dave Olsthoorn wrote:
> > The firmware still seems to crash quicker than previously, but
> > that's an unrelated problem.
>
> Hello! Do you have some more details (or links) about mentioned firmware
> crash?

Sure, firmware crashes have always been a problem on the Surface devices.

What wifi chip do you have on these devices? I ask because I see very similar
firmware crashes on the 88W8997 chip (also with mwifiex) when the wifi card is
configured in SDIO mode (not PCIe).


The Surface Pro 2017 has an 88W8897.

I know that there are newer firmware versions for these 88W8xxx chips,
but they are available only under NXP NDA and only for NXP customers.
So it looks like end users with NXP wifi chips are out of luck.

They seem to be related, at least for some of the crashes, to power
management. For this reason I disabled powersaving in NetworkManager, which
used to make it at least stable enough for me. In 5.13 this trick no longer
seems to work.

The attached dmesg log shows a firmware crash happening; the card does not
work even after a reset or a remove & rescan on the PCI(e) bus.

Similar issue: the card starts working again only after a whole system restart.

So this is something which can be resolved only by NXP.

After a conversation with the author of the patches, the problem is not the power management itself (for most hardware revisions [1]) but a race where pci commands are being written while the device is being put to sleep. A fix for this problem is included in the patches which make all pci commands synchronous instead of asynchronous [2].

After that, the wakeup patch seems relevant [3].
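
To illustrate the kind of race described above, here is a toy user-space model. It is not driver code: every name in it is hypothetical, nothing is taken from mwifiex or the linked patches, and the rule that a write during the sleep transition crashes the firmware is an assumption of the model.

```python
from dataclasses import dataclass
from enum import Enum, auto


class FwState(Enum):
    AWAKE = auto()
    ENTERING_SLEEP = auto()
    CRASHED = auto()


@dataclass
class ToyDevice:
    """Hypothetical device model; not based on real mwifiex structures."""
    state: FwState = FwState.AWAKE
    queued: int = 0      # commands accepted but not yet written to the device
    completed: int = 0   # commands the device has finished

    def flush(self) -> None:
        """Write all queued commands to the device."""
        while self.queued and self.state is not FwState.CRASHED:
            if self.state is FwState.ENTERING_SLEEP:
                # Assumption of this model: a write during the sleep
                # transition is what takes the firmware down.
                self.state = FwState.CRASHED
                return
            self.queued -= 1
            self.completed += 1

    def send_async(self) -> None:
        """Queue a command and return; the write happens at a later flush()."""
        self.queued += 1

    def send_sync(self) -> None:
        """Queue a command and do not return until it has been written."""
        self.queued += 1
        self.flush()

    def enter_sleep(self) -> None:
        """Start the sleep transition if the device is awake."""
        if self.state is FwState.AWAKE:
            self.state = FwState.ENTERING_SLEEP


def run_scenario(synchronous: bool) -> FwState:
    """Send one command, then let power management put the device to sleep."""
    dev = ToyDevice()
    if synchronous:
        dev.send_sync()
    else:
        dev.send_async()
    dev.enter_sleep()   # power management kicks in
    dev.flush()         # deferred work runs late, if anything is still queued
    return dev.state


if __name__ == "__main__":
    print("async:", run_scenario(False).name)
    print("sync: ", run_scenario(True).name)
```

In this model the asynchronous path leaves a command queued across the sleep transition, so the late write hits a device that is going to sleep. The synchronous path has nothing left to write by the time sleep starts.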

<snip>
There are patches [1] which have not been submitted yet and were developed
as part of the linux-surface effort [2]. In my experience these patches
resolve most if not all of the firmware crashes.

Is somebody going to clean up these patches and send them for inclusion
into the mainline kernel? I see that most of them are PCIe related, but since
I see the same issues on the SDIO bus too, I guess adding similar hooks
for SDIO could make SDIO more stable as well...

The author plans to upstream them; he just hasn't gotten around to it yet.

Regards,
Dave

[1]: https://github.com/linux-surface/linux-surface/blob/master/patches/5.12/0002-mwifiex.patch#L2237-L2338
[2]: https://github.com/linux-surface/linux-surface/blob/master/patches/5.12/0002-mwifiex.patch#L1152-L1207
[3]: https://github.com/linux-surface/linux-surface/blob/master/patches/5.12/0002-mwifiex.patch#L1992-L2079