Re: [PATCH v2 2/2] mwifiex: Try waking the firmware until we get an interrupt

From: Jonas Dreßler
Date: Thu Sep 30 2021 - 14:04:12 EST


On 9/22/21 1:19 PM, Andy Shevchenko wrote:
On Tue, Sep 14, 2021 at 01:48:13PM +0200, Jonas Dreßler wrote:
It seems that the firmware of the 88W8897 card sometimes ignores or
misses when we try to wake it up by writing to the firmware status
register. This leads to the firmware wakeup timeout expiring and the
driver resetting the card because we assume the firmware has hung up or
crashed (unfortunately that's not unlikely with this card).

Turns out that most of the time the firmware actually didn't hang up,
but simply "missed" our wakeup request and didn't send us an AWAKE
event.

Trying again to read the firmware status register after a short timeout
usually makes the firmware wake up as expected, so add a small retry
loop to mwifiex_pm_wakeup_card() that looks at the interrupt status to
check whether the card woke up.

The number of tries and timeout lengths for this were determined
experimentally: The firmware usually takes about 500 us to wake up
after we attempt to read the status register. In some cases where the
firmware is very busy (for example while doing a bluetooth scan) it
might even miss our requests for multiple milliseconds, which is why
after 15 tries the waiting time gets increased to 10 ms. The maximum
number of tries it took to wake the firmware when testing this was
around 20, so a maximum number of 50 tries should give us plenty of
safety margin.
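
As a rough check of the time budget described above, the following
userspace sketch adds up the nominal sleep time if the firmware never
wakes up. It mirrors the structure of the retry loop in the patch but is
not driver code. The value 15 for the short tries comes from the text;
35 for the long tries is an assumption inferred from the stated total of
50, and worst_case_sleep_us() is a hypothetical helper.

```c
/* 15 is taken from the description; 35 is an ASSUMPTION (50 - 15). */
#define N_WAKEUP_TRIES_SHORT_INTERVAL 15
#define N_WAKEUP_TRIES_LONG_INTERVAL 35

/*
 * Hypothetical helper: nominal upper bound (in microseconds) on the time
 * spent sleeping when the firmware never wakes up. Real sleeps can take
 * longer, since usleep_range() and msleep() only guarantee a minimum.
 */
static long worst_case_sleep_us(void)
{
	long total_us = 0;
	int n_tries = 0;

	do {
		n_tries++;

		if (n_tries <= N_WAKEUP_TRIES_SHORT_INTERVAL)
			total_us += 700;	/* upper end of usleep_range(400, 700) */
		else
			total_us += 10000;	/* msleep(10) */
	} while (n_tries <= N_WAKEUP_TRIES_SHORT_INTERVAL + N_WAKEUP_TRIES_LONG_INTERVAL);

	return total_us;
}
```

Under these assumptions the function returns 370500, i.e. roughly
0.37 s. Note that the do/while condition is evaluated after the
increment, so the body runs 51 times rather than 50.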

A good reproducer for this issue is letting the firmware sleep and wake
up in very short intervals, for example by pinging a device on the
network every 0.1 seconds.

...

+	do {
+		if (mwifiex_write_reg(adapter, reg->fw_status, FIRMWARE_READY_PCIE)) {
+			mwifiex_dbg(adapter, ERROR,
+				    "Writing fw_status register failed\n");
+			return -EIO;
+		}
+
+		n_tries++;
+
+		if (n_tries <= N_WAKEUP_TRIES_SHORT_INTERVAL)
+			usleep_range(400, 700);
+		else
+			msleep(10);
+	} while (n_tries <= N_WAKEUP_TRIES_SHORT_INTERVAL + N_WAKEUP_TRIES_LONG_INTERVAL &&
+		 READ_ONCE(adapter->int_status) == 0);

Can't you use read_poll_timeout() twice instead of this custom approach?


I've tried this now, but read_poll_timeout() is not ideal for our
use-case. What we'd need would be read->sleep->poll->repeat instead of
read->poll->sleep->repeat. With read_poll_timeout() we always end up
doing one more (unnecessary) write.
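
To illustrate the ordering difference, here is a minimal userspace
sketch (not driver code). struct fake_fw and all fake_*/wake_* helpers
are hypothetical, and the model assumes the firmware only reacts to a
wakeup write while the driver is sleeping.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the card; not part of the mwifiex driver. */
struct fake_fw {
	int writes_needed;	/* wakeup writes until the firmware notices */
	int writes_done;	/* wakeup writes issued so far */
	bool awake;		/* set once the firmware has reacted */
};

static void fake_write_wakeup(struct fake_fw *fw)
{
	fw->writes_done++;
}

/* ASSUMPTION: the firmware only reacts while the driver is sleeping. */
static void fake_sleep(struct fake_fw *fw)
{
	if (fw->writes_done >= fw->writes_needed)
		fw->awake = true;
}

/* Order used by the patch: write, sleep, check. */
static int wake_write_sleep_check(struct fake_fw *fw, int max_tries)
{
	int n_tries = 0;

	do {
		fake_write_wakeup(fw);
		n_tries++;
		fake_sleep(fw);
	} while (n_tries < max_tries && !fw->awake);

	return fw->writes_done;
}

/* Order of a read_poll_timeout()-style loop: write, check, sleep. */
static int wake_write_check_sleep(struct fake_fw *fw, int max_tries)
{
	int n_tries;

	for (n_tries = 0; n_tries < max_tries; n_tries++) {
		fake_write_wakeup(fw);
		if (fw->awake)
			break;
		fake_sleep(fw);
	}

	return fw->writes_done;
}
```

In this model, a firmware that reacts to the first write is woken with
one write by wake_write_sleep_check(), while wake_write_check_sleep()
issues a second write before it notices that the card is already awake.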

+	mwifiex_dbg(adapter, EVENT,
+		    "event: Tried %d times until firmware woke up\n", n_tries);