Re: [PATCH 5.13 024/800] usb: renesas-xhci: Fix handling of unknown ROM state

From: Moritz Fischer
Date: Tue Jul 20 2021 - 01:38:56 EST


On Mon, Jul 19, 2021 at 09:57:00PM -0500, Justin Forbes wrote:
> On Mon, Jul 19, 2021 at 10:33 AM Justin Forbes <jmforbes@xxxxxxxxxxx> wrote:
> >
> > On Sat, Jul 17, 2021 at 5:33 PM Moritz Fischer <mdf@xxxxxxxxxx> wrote:
> > >
> > > Justin,
> > >
> > > On Sat, Jul 17, 2021 at 08:39:19AM -0500, Justin Forbes wrote:
> > > > On Mon, Jul 12, 2021 at 2:31 AM Greg Kroah-Hartman
> > > > <gregkh@xxxxxxxxxxxxxxxxxxx> wrote:
> > > > >
> > > > > From: Moritz Fischer <mdf@xxxxxxxxxx>
> > > > >
> > > > > commit d143825baf15f204dac60acdf95e428182aa3374 upstream.
> > > > >
> > > > > The ROM load sometimes seems to return an unknown status
> > > > > (RENESAS_ROM_STATUS_NO_RESULT) instead of success / fail.
> > > > >
> > > > > If the ROM load indeed failed this leads to failures when trying to
> > > > > communicate with the controller later on.
> > > > >
> > > > > Attempt to load firmware using RAM load in those cases.
> > > > >
> > > > > Fixes: 2478be82de44 ("usb: renesas-xhci: Add ROM loader for uPD720201")
> > > > > Cc: stable@xxxxxxxxxxxxxxx
> > > > > Cc: Mathias Nyman <mathias.nyman@xxxxxxxxx>
> > > > > Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > > Cc: Vinod Koul <vkoul@xxxxxxxxxx>
> > > > > Tested-by: Vinod Koul <vkoul@xxxxxxxxxx>
> > > > > Reviewed-by: Vinod Koul <vkoul@xxxxxxxxxx>
> > > > > Signed-off-by: Moritz Fischer <mdf@xxxxxxxxxx>
> > > > > Link: https://lore.kernel.org/r/20210615153758.253572-1-mdf@xxxxxxxxxx
> > > > > Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
> > > > >
> > > >
> > > > After sending out 5.12.17 for testing, we had a user complain that all
> > > > of their USB devices disappeared with the error:
> > > >
> > > > Jul 15 23:18:53 kernel: xhci_hcd 0000:04:00.0: Direct firmware load
> > > > for renesas_usb_fw.mem failed with error -2
> > > > Jul 15 23:18:53 kernel: xhci_hcd 0000:04:00.0: request_firmware failed: -2
> > > > Jul 15 23:18:53 kernel: xhci_hcd: probe of 0000:04:00.0 failed with error -2
> > >
> > > This looks like it fails finding the actual firmware file (ENOENT). Any
> > > chance you could give this a whirl on top of the original patch?
> > >
> >
> > Sure. test kernel building now, will let you know when the user reports back.
>
> The original user reports success with this patch on top of the original patch.

That's good news I guess.

After reading through the datasheet once more I'm even more convinced
that the original code with the early return in
renesas_check_fw_running() is *very* shady.

There are three statuses to be investigated
- FW load status (fw_state)
- ROM download status (rom_status)
- Firmware version as reported by chip

Currently the code takes an early return if the latter says the
external ROM is there and the 'write firmware to external ROM' step
worked out, which I think shouldn't be happening, since it doesn't tell
us anything about the firmware state at all. In fact I think the early
return should not exist at all (a path that the original patch made more
likely to happen).

The FW load status indicates whether firmware has been runtime loaded
and, I suspect, returns 'No result yet' in your case too, which *might*
happen if the chip configured itself from external ROM?

So the part that is somewhat unclear to me is whether we should use
either of them at all when deciding whether to load firmware.

Maybe what we should do is:
- Attempt to request_firmware()
  - If fail -> proceed and hope for the best
  - If success
    - Compare the firmware file version with the version reported by the
      controller
    - If they don't match, load firmware, otherwise leave it alone?
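
To make the proposal above concrete, here is a rough user-space sketch of
just the decision logic. It is not driver code: the enum, the function
name renesas_fw_decide() and the way the two versions are passed in are
all made up for illustration, and how the versions would actually be read
from the file and from the chip is left open.

```c
#include <stdint.h>

/* Hypothetical outcome names, not taken from the driver. */
enum fw_action {
	FW_ACTION_SKIP,	/* leave the controller alone */
	FW_ACTION_LOAD,	/* download the firmware file to the controller */
};

/*
 * Hypothetical helper modelling the steps listed above.
 *
 * req_err:      result of request_firmware(), 0 on success or a
 *               negative errno such as -ENOENT
 * file_version: version found in the firmware file (assumed available)
 * chip_version: version reported by the controller (assumed available)
 */
static enum fw_action renesas_fw_decide(int req_err,
					uint32_t file_version,
					uint32_t chip_version)
{
	/* No firmware file: proceed and hope for the best. */
	if (req_err)
		return FW_ACTION_SKIP;

	/* File and controller disagree: load the file. */
	if (file_version != chip_version)
		return FW_ACTION_LOAD;

	/* Versions match: nothing to do. */
	return FW_ACTION_SKIP;
}
```

Note that neither fw_state nor rom_status is an input here, which is the
point of the question above: in this sketch only the result of
request_firmware() and the two versions decide what happens.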

- Moritz


>
> Justin
>
> >
> > Justin
> >
> > > diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> > > index 18c2bbddf080..cde8f6f1ec5d 100644
> > > --- a/drivers/usb/host/xhci-pci.c
> > > +++ b/drivers/usb/host/xhci-pci.c
> > > @@ -379,7 +379,11 @@ static int xhci_pci_probe(struct pci_dev *dev, const struct pci_device_id *id)
> > >  	driver_data = (struct xhci_driver_data *)id->driver_data;
> > >  	if (driver_data && driver_data->quirks & XHCI_RENESAS_FW_QUIRK) {
> > >  		retval = renesas_xhci_check_request_fw(dev, id);
> > > -		if (retval)
> > > +		/*
> > > +		 * If firmware wasn't found there's still a chance this might work without
> > > +		 * loading firmware on some systems, so let's try at least.
> > > +		 */
> > > +		if (retval && retval != -ENOENT)
> > >  			return retval;
> > >  	}
> > >
> > >
> > > Thanks,
> > > Moritz