Hi Mathias,
I have some comments for the implementation of xhci_abort_cmd_ring() below.
On 12/20/2016 11:13 PM, Mathias Nyman wrote:
On 20.12.2016 09:30, Baolin Wang wrote:
...
Alright, I gathered all current work related to xhci races and timeouts
and put them into a branch:
git://git.kernel.org/pub/scm/linux/kernel/git/mnyman/xhci.git timeout_race_fixes
Its based on 4.9
It includes a few other patches just to avoid conflicts and make my life easier
Interesting patches are:
ee4eb91 xhci: remove unnecessary check for pending timer
0cba67d xhci: detect stop endpoint race using pending timer instead of counter.
4f2535f xhci: Handle command completion and timeout race
b9d00d7 usb: host: xhci: Fix possible wild pointer when handling abort command
529a5a0 usb: xhci: fix possible wild pointer
4766555 xhci: Fix race related to abort operation
de834a3 xhci: Use delayed_work instead of timer for command timeout
69973b8 Linux 4.9
The fixes for command queue races will go to usb-linus and stable, the
reworks for stop ep watchdog timer will go to usb-next.
Still completely untested, (well it compiles)
Felipe gave instructions how to modify dwc3 driver to timeout on address
devicecommands to test these, I'll try to set that up.
All additional testing is welcome, especially if you can trigger timeouts
and races
-Mathias
Below is the latest code. I put my comments in line.
322 static int xhci_abort_cmd_ring(struct xhci_hcd *xhci)
323 {
324 u64 temp_64;
325 int ret;
326
327 xhci_dbg(xhci, "Abort command ring\n");
328
329 reinit_completion(&xhci->cmd_ring_stop_completion);
330
331 temp_64 = xhci_read_64(xhci, &xhci->op_regs->cmd_ring);
332 xhci_write_64(xhci, temp_64 | CMD_RING_ABORT,
333 &xhci->op_regs->cmd_ring);
We should hold xhci->lock when we are modifying xhci registers
at runtime.
The retry of setting CMD_RING_ABORT is not necessary according to
previous discussion. We have cleaned code for second try in
xhci_handle_command_timeout(). Need to clean up here as well.