Re: [PATCH 1/2] mwifiex: Use non-posted PCI register writes

From: Pali Rohár
Date: Thu Sep 30 2021 - 11:42:08 EST


On Thursday 30 September 2021 17:38:43 Jonas Dreßler wrote:
> On 9/23/21 10:22 PM, Pali Rohár wrote:
> > On Thursday 23 September 2021 22:41:30 Andy Shevchenko wrote:
> > > On Thu, Sep 23, 2021 at 6:28 PM Jonas Dreßler <verdre@xxxxxxx> wrote:
> > > > On 9/22/21 2:50 PM, Jonas Dreßler wrote:
> > >
> > > ...
> > >
> > > > - Just calling mwifiex_write_reg() once and then blocking until the card
> > > > wakes up using my delay-loop doesn't fix the issue; it's actually
> > > > writing multiple times that fixes it
> > > >
> > > > These observations sound a lot like writes (and even reads) are actually
> > > > being dropped, don't they?
> > >
> > > It sounds like you're writing into a device that is not ready (not fully
> > > powered on).
> >
> > This reminds me of a discussion with Bjorn about the CRS response returned
> > after a firmware crash / reset when the device is not ready yet:
> > https://lore.kernel.org/linux-pci/20210922164803.GA203171@bhelgaas/
> >
> > Could this not be a similar issue? You could check it by reading the
> > PCI_VENDOR_ID register from config space. If it does not return a valid
> > value, then the card is not really ready yet.
> >
> > > To check this, try to put a busy loop for reading and check the value
> > > till it gets 0.
> > >
> > > Something like
> > >
> > > unsigned int count = 1000;
> > >
> > > do {
> > > 	if (mwifiex_read_reg(...) == 0)
> > > 		break;
> > > } while (--count);
> > >
> > >
> > > --
> > > With Best Regards,
> > > Andy Shevchenko
>
> I've tried both reading PCI_VENDOR_ID and the firmware status using a busy
> loop now, but sadly neither of them worked. It looks like the card always
> replies with the correct values even though it sometimes won't wake up after
> that.
>
> I do have one new observation though, although I've no clue what could be
> happening here: When reading PCI_VENDOR_ID 1000 times to wake up the card,
> we can "predict" the wakeup failure because exactly one (usually around the
> 20th) of those 1000 reads will fail.

What does "fail" mean here?
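
To make the question concrete, here is a rough sketch (plain userspace C,
not driver code) of how each value read back could be classified. It
assumes the usual conventions: an all-ones value (0xffff) when the read
gets no valid completion, and 0x0001 when CRS Software Visibility is
enabled and the device answers with CRS. The function and enum names are
made up for illustration only.

```c
#include <stdint.h>

/* Hypothetical classification of a 16-bit PCI_VENDOR_ID read result. */
enum vid_result {
	VID_OK,		/* matches the vendor ID we expect */
	VID_ALL_ONES,	/* 0xffff: read did not complete successfully */
	VID_CRS,	/* 0x0001: CRS reported (CRS SV enabled) */
	VID_OTHER,	/* anything else: unexpected value */
};

static enum vid_result classify_vendor_id(uint16_t val, uint16_t expected)
{
	if (val == expected)
		return VID_OK;
	if (val == 0xffff)
		return VID_ALL_ONES;
	if (val == 0x0001)
		return VID_CRS;
	return VID_OTHER;
}
```

Logging which of these cases the one bad read falls into (with the
expected ID of the card passed as the second argument) would already
narrow things down a lot.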

> Maybe the firmware actually tries to wake up,
> encounters an error somewhere in its wakeup routines and then goes down a
> special failure code path. That code path keeps the card's CPU so busy that
> at some point a PCI_VENDOR_ID request times out?
>
> Or well, maybe the card actually wakes up fine, but we don't receive the
> interrupt on our end, so many possibilities...