-----Original Message-----
From: Mathias Nyman [mailto:mathias.nyman@xxxxxxxxxxxxxxx]
Sent: Wednesday, March 23, 2016 7:52 PM
To: Rajesh Bhagat <rajesh.bhagat@xxxxxxx>
Cc: gregkh@xxxxxxxxxxxxxxxxxxx; linux-usb@xxxxxxxxxxxxxxx; linux-
kernel@xxxxxxxxxxxxxxx; Sriram Dash <sriram.dash@xxxxxxx>
Subject: Re: [PATCH] usb: xhci: Fix incomplete PM resume operation due to XHCI
commmand timeout
On 23.03.2016 05:53, Rajesh Bhagat wrote:
to 0 by xHC.IMO, The assumption that "xhci_abort_cmd_ring would always generateis in bad state.
an event and handle_cmd_completion would be called" will not be
always be true if HW
Please share your opinion.
writing the CA (command abort) bit in CRCR (command ring control
register) will stop the command ring, and CRR (command ring running) will be set
completion event.xhci_abort_cmd_ring() polls this bit up to 5 seconds.
If it's not 0 then the driver considers the command abort as failed.
The scenario you're thinking of is that xHC would still react to CA
bit set, it would stop the command ring and set CRR 0, but not send a command
commands.
Have you tried adding some debug to handle_cmd_completion() and see
if you receive any event after command abortion?
Yes. We have added debug prints at first line of
handle_cmd_completion, and we are not getting those prints. The last
print messages that we get are as below from xhci_alloc_dev while
resume
operation:
xhci-hcd xhci-hcd.0.auto: Command timeout xhci-hcd xhci-hcd.0.auto:
Abort command ring
May be somehow, USB controller is in bad state and not responding to the
Please suggest how XHCI driver can handle such situations.
Restart the command timeout timer when writing the command abort bit.
If we get theIf we get the abort event the timer is deleted.
Otherwise if the timout triggers a second time we end up calling
xhci_handle_command_timeout() with a stopped ring, This will call
xhci_handle_stopped_cmd_ring(), turn the aborted command to no-op, restart the
command ring, and finally when the no-op completes it should call the missing
completion.
If command ring doesn't start then additional code could be added to
xhci_handle_command_timeout() that clears the command ring if it is called a second
time (=current command is already in abort state and command ring is stopped when
entering xhci_handle_command_timeout)
There might be some details missing, I'm not able to test any of this, but try
something like this:
diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c index
3e1d24c..576819e 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -319,7 +319,10 @@ static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
xhci_halt(xhci);
return -ESHUTDOWN;
}
-
+ /* writing the CMD_RING_ABORT bit should create a command completion
+ * event, add a command completion timeout for it as well
+ */
+ mod_timer(&xhci->cmd_timer, jiffies + XHCI_CMD_DEFAULT_TIMEOUT);
return 0;
}
Hello Mathias,
Thanks for the patch.
After application of above patch, I'm getting following prints constantly:
xhci-hcd xhci-hcd.0.auto: Command timeout
xhci-hcd xhci-hcd.0.auto: Abort command ring
xhci-hcd xhci-hcd.0.auto: Command timeout on stopped ring
xhci-hcd xhci-hcd.0.auto: Turn aborted command be56e000 to no-op
xhci-hcd xhci-hcd.0.auto: // Ding dong!
...
xhci-hcd xhci-hcd.0.auto: Command timeout
xhci-hcd xhci-hcd.0.auto: Abort command ring
xhci-hcd xhci-hcd.0.auto: Command timeout on stopped ring
xhci-hcd xhci-hcd.0.auto: Turn aborted command be56e000 to no-op
xhci-hcd xhci-hcd.0.auto: // Ding dong!
As expected, xhci_handle_command_timeout is called again and next time ring state
is __not__ CMD_RING_STATE_RUNNING, Hence xhci_handle_stopped_cmd_ring is called
which turn all the aborted commands to no-ops and again makes the ring state as
CMD_RING_STATE_RUNNING, and rings the door bell.
But again in this case, no response from USB controller and xhci_alloc_dev is still waiting for
wait_for_completion.
Does rescheduling the xhci->cmd_timer ensures command completion will be called. IMO, it is still
dependent on HW response.