Re: TI PCIe xHCI and kexec

From: Mathias Nyman
Date: Thu Feb 06 2020 - 09:23:16 EST


On 6.2.2020 5.37, Joel Stanley wrote:
> On Wed, 5 Feb 2020 at 09:35, Mathias Nyman
> <mathias.nyman@xxxxxxxxxxxxxxx> wrote:
>>
>> On 5.2.2020 2.55, Joel Stanley wrote:
>>> I'm supporting a system that uses Linux-as-a-bootloader to load a
>>> distro kernel via kexec. The systems have a TI TUSB73x0 PCIe
>>> controller which goes out to lunch after a kexec. This is the distro
>>> (post-kexec) kernel:
>>>
>>> [ 0.235411] pci 0003:01:00.0: xHCI HW did not halt within 16000
>>> usec status = 0x0
>>> [ 1.037298] xhci_hcd 0003:01:00.0: xHCI Host Controller
>>> [ 1.037367] xhci_hcd 0003:01:00.0: new USB bus registered, assigned
>>> bus number 1
>>> [ 1.053481] xhci_hcd 0003:01:00.0: Host halt failed, -110
>>> [ 1.053523] xhci_hcd 0003:01:00.0: can't setup: -110
>>> [ 1.053565] xhci_hcd 0003:01:00.0: USB bus 1 deregistered
>>> [ 1.053629] xhci_hcd 0003:01:00.0: init 0003:01:00.0 fail, -110
>>> [ 1.053703] xhci_hcd: probe of 0003:01:00.0 failed with error -110
>>>
>>> 0003:01:00.0 USB controller: Texas Instruments TUSB73x0 SuperSpeed USB
>>> 3.0 xHCI Host Controller (rev 02)
>>>
>>> The full debug log of the distro kernel booting is below.
>>>
>>> [ 1.037833] xhci_hcd 0003:01:00.0: USBCMD 0x0:
>>> [ 1.037835] xhci_hcd 0003:01:00.0: HC is being stopped
>>> [ 1.037837] xhci_hcd 0003:01:00.0: HC has finished hard reset
>>> [ 1.037839] xhci_hcd 0003:01:00.0: Event Interrupts disabled
>>> [ 1.037841] xhci_hcd 0003:01:00.0: Host System Error Interrupts disabled
>>> [ 1.037843] xhci_hcd 0003:01:00.0: HC has finished light reset
>>> [ 1.037846] xhci_hcd 0003:01:00.0: USBSTS 0x0:
>>> [ 1.037847] xhci_hcd 0003:01:00.0: Event ring is empty
>>> [ 1.037849] xhci_hcd 0003:01:00.0: No Host System Error
>>> [ 1.037851] xhci_hcd 0003:01:00.0: HC is running
>>
>> Hmm, all bits in both USBCMD and USBSTS are 0. This is a bit suspicious.
>> Normally at least USBCMD Run/Stop bit, and USBSTS HCHalted bit have
>> opposite values.
>
> Does this suggest the controller is not responding at all?
>

The capability registers look fine, and so do the port status registers.
It's just the operational USBSTS and USBCMD registers that return 0.

The current xhci implementation assumes the host failed to halt because the
USBSTS HCHalted bit is still 0, and bails out before the reset.
The host is probably not running; the registers just return all zeroes.
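
To illustrate why an all-zero USBSTS ends in -110: the halt path polls for
HCHalted until a time budget expires. The following is only a rough model of
that polling, not the driver code. The names poll_handshake, read_reg_fn,
read_stuck_at_zero and read_halts_after are made up for this sketch, and time
is counted per iteration instead of actually being waited. -110 is -ETIMEDOUT
on most Linux architectures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define STS_HALT      (1u << 0)  /* USBSTS bit 0: HCHalted */
#define MAX_HALT_USEC 16000      /* the 16000 usec budget seen in the log */

/* Hypothetical stand-in for an MMIO read; the real driver uses readl(). */
typedef uint32_t (*read_reg_fn)(void *ctx);

/*
 * Poll until (reg & mask) == done or the budget is used up.
 * Returns 0 on success and -ETIMEDOUT on timeout.
 * Each loop iteration stands for one microsecond; nothing really sleeps.
 */
static int poll_handshake(read_reg_fn read_reg, void *ctx,
                          uint32_t mask, uint32_t done, int budget_usec)
{
    assert(read_reg != NULL);
    for (int waited = 0; waited <= budget_usec; waited++) {
        if ((read_reg(ctx) & mask) == done)
            return 0;
    }
    return -ETIMEDOUT;
}

/* A register that always reads back zero, like USBSTS in the report. */
static uint32_t read_stuck_at_zero(void *ctx)
{
    (void)ctx;
    return 0;
}

/* A healthy controller: reports HCHalted after *ctx further polls. */
static uint32_t read_halts_after(void *ctx)
{
    int *polls_left = ctx;

    if (*polls_left > 0) {
        (*polls_left)--;
        return 0;
    }
    return STS_HALT;
}
```

With read_stuck_at_zero the poll can never see HCHalted, so it always times
out, regardless of whether the controller is actually running.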

Can you check whether the code below works? It also checks the USBCMD
Run/Stop bit to decide whether the host is running, and continues with the
host reset if it is not.

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index fe38275363e0..2dbfeaf88574 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -177,8 +177,16 @@ int xhci_reset(struct xhci_hcd *xhci)
 	}

 	if ((state & STS_HALT) == 0) {
-		xhci_warn(xhci, "Host controller not halted, aborting reset.\n");
-		return 0;
+		/*
+		 * After a kexec TI TUSB73x0 might appear running as its USBSTS
+		 * and USBCMD registers return all zeroes. Doublecheck if host
+		 * is running from USBCMD RUN bit before bailing out.
+		 */
+		command = readl(&xhci->op_regs->command);
+		if (command & CMD_RUN) {
+			xhci_warn(xhci, "Host controller not halted, aborting reset.\n");
+			return 0;
+		}
 	}

 	xhci_dbg_trace(xhci, trace_xhci_dbg_init, "// Reset the HC");
@@ -5217,7 +5225,7 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
 	/* Make sure the HC is halted. */
 	retval = xhci_halt(xhci);
 	if (retval)
-		return retval;
+		xhci_warn(xhci, "Continue with reset even if host appears running\n");

 	xhci_zero_64b_regs(xhci);
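
For reference, the change to the reset precondition can be modelled in
isolation. This is a standalone sketch, not part of the patch: the helper
names abort_reset_before and abort_reset_after are invented here, and only the
bit positions (Run/Stop is USBCMD bit 0, HCHalted is USBSTS bit 0) come from
the xHCI register layout.

```c
#include <stdbool.h>
#include <stdint.h>

#define CMD_RUN  (1u << 0)  /* USBCMD bit 0: Run/Stop */
#define STS_HALT (1u << 0)  /* USBSTS bit 0: HCHalted */

/* Before the change: any status without HCHalted aborts the reset. */
static bool abort_reset_before(uint32_t status, uint32_t command)
{
    (void)command;  /* USBCMD is not consulted at all */
    return (status & STS_HALT) == 0;
}

/* After the change: abort only if USBCMD also says the host is running. */
static bool abort_reset_after(uint32_t status, uint32_t command)
{
    return (status & STS_HALT) == 0 && (command & CMD_RUN) != 0;
}
```

The only input where the two differ is status == 0 with command == 0, which is
the all-zero case from the debug log: the old check aborts, the new one lets
the reset proceed.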