Re: Should xhci_irq() call usb_hc_died()?

From: Mathias Nyman
Date: Mon Dec 12 2016 - 05:48:00 EST


On 12.12.2016 10:43, Felipe Balbi wrote:

Hi,

Bjorn Helgaas <helgaas@xxxxxxxxxx> writes:
Hi Mathias,

ehci_irq(), ohci_irq(), fotg210_irq(), and oxu210_hcd_irq() contain code
equivalent to this:

	status = ehci_readl(...);
	if (status == ~(u32) 0) {
		...
		usb_hc_died(hcd);
		...
		return IRQ_HANDLED;
	}

xhci_irq() has a similar check, but does not call usb_hc_died():

	status = readl(...);
	if (status == 0xffffffff) {
		...
		return IRQ_HANDLED;
	}

Should xhci_irq() also call usb_hc_died()? Maybe there's some reason
for it to be different than the others, but it wasn't obvious to this
casual observer :)
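
For illustration only, the pattern being asked about can be modelled in plain userspace C. This is a minimal sketch, not the xhci driver code: `fake_hcd`, `fake_read_status()`, `fake_hc_died()` and `fake_irq()` are hypothetical stand-ins for `struct usb_hcd`, `readl()`, `usb_hc_died()` and the interrupt handler. Treating a zero status as "not our interrupt" is a simplification made here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct usb_hcd; holds only what the sketch needs. */
struct fake_hcd {
	uint32_t status_reg;	/* value the next status read returns */
	bool died;		/* set once the "died" path has run */
};

enum fake_irqreturn { FAKE_IRQ_NONE, FAKE_IRQ_HANDLED };

/* Stand-in for readl() on the controller status register. */
static uint32_t fake_read_status(const struct fake_hcd *hcd)
{
	return hcd->status_reg;
}

/* Stand-in for usb_hc_died(): only records that the controller is gone. */
static void fake_hc_died(struct fake_hcd *hcd)
{
	hcd->died = true;
}

/* Model of an interrupt handler that reports a vanished controller. */
static enum fake_irqreturn fake_irq(struct fake_hcd *hcd)
{
	uint32_t status;

	assert(hcd != NULL);
	status = fake_read_status(hcd);

	/* All ones means the read did not reach the hardware. */
	if (status == ~(uint32_t)0) {
		fake_hc_died(hcd);
		return FAKE_IRQ_HANDLED;
	}

	/* Simplification: no status bits set, so not our interrupt. */
	if (status == 0)
		return FAKE_IRQ_NONE;

	return FAKE_IRQ_HANDLED;
}
```

In this model, calling `fake_irq()` on a `fake_hcd` whose `status_reg` is 0xffffffff sets `died` and returns `FAKE_IRQ_HANDLED`; any other non-zero status is handled without marking the controller dead.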

It probably should, I'm not aware of any reason why not, and a quick look at the
logs didn't reveal anything.

Currently we are calling usb_hc_died() in a couple of timeout cases if we read
0xffffffff from the PCI registers, so eventually usb_hc_died() will be called.
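
A rough userspace model of such a timeout path, assuming a polling loop that gives up after a fixed number of reads. All names here (`fake_reg`, `fake_readl()`, `fake_poll()`, the `poll_result` values) are hypothetical and are not the driver's actual helpers. The sketch only shows the idea of treating an all-ones read as a dead controller rather than as an ordinary timeout.

```c
#include <stdbool.h>
#include <stdint.h>

enum poll_result { POLL_OK = 0, POLL_TIMEOUT = -1, POLL_DEAD = -2 };

/* Hypothetical register that replays a fixed sequence of read values. */
struct fake_reg {
	const uint32_t *values;	/* successive read results, count >= 1 */
	int count;
	int pos;
};

/* Stand-in for readl(): returns the next value, repeating the last one. */
static uint32_t fake_readl(struct fake_reg *reg)
{
	uint32_t v = reg->values[reg->pos];

	if (reg->pos < reg->count - 1)
		reg->pos++;
	return v;
}

/*
 * Poll until (value & mask) == done, for at most max_reads reads.
 * An all-ones read is taken to mean the controller is gone: *died is
 * set (where a driver would call usb_hc_died()) and polling stops.
 */
static enum poll_result fake_poll(struct fake_reg *reg, uint32_t mask,
				  uint32_t done, int max_reads, bool *died)
{
	for (int i = 0; i < max_reads; i++) {
		uint32_t v = fake_readl(reg);

		if (v == ~(uint32_t)0) {
			*died = true;
			return POLL_DEAD;
		}
		if ((v & mask) == done)
			return POLL_OK;
	}
	return POLL_TIMEOUT;
}
```

The point of separating `POLL_DEAD` from `POLL_TIMEOUT` in this sketch is that the caller can tell a slow controller from a missing one.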

I'll take a look at this in more detail.


you might just have fixed several bugs in dealing with a dead HC :-)

Can you provide a patch? (well, unless Mathias has a strong reason not
to call usb_hc_died(), of course).

I don't think this is the worst case. There are a couple of other cases, such as the
normal PCI remove case, where we halt the host and reset the hardware after the first
HCD (USB2) is removed, with the secondary HCD (USB3) and all its devices still connected.

Or the abnormal case where the HC disappears: we may time out while giving back a
killed URB, and may end up never returning it. USB core waits for the URB with the
roothub device lock held, and we try to tear down xhci, which also requires the roothub
device lock at some point -> deadlock.

I am looking at these, but I need to make sure I fix them properly and don't cause even
more issues.

-Mathias