Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within aUSB payload burst

From: Sarah Sharp
Date: Fri Jan 03 2014 - 14:55:26 EST


On Fri, Jan 03, 2014 at 07:40:33AM -0800, walt wrote:
> On 01/02/2014 11:15 AM, Sarah Sharp wrote:
> > On Tue, Dec 31, 2013 at 12:40:16PM -0800, walt wrote:
> >> On 12/18/2013 01:11 PM, Greg Kroah-Hartman wrote:
> >>> 3.12-stable review patch. If anyone has any objections, please let me know.
> >>>
> >>> ------------------
> >>>
> >>> From: David Laight <David.Laight@xxxxxxxxxx>
> >>>
> >>> commit 35773dac5f862cb1c82ea151eba3e2f6de51ec3e upstream.
> >>>
> >>> Section 4.11.7.1 of rev 1.0 of the xhci specification states that a link TRB
> >>> can only occur at a boundary between underlying USB frames (512 bytes for
> >>> high speed devices).
> >>>
> >>> If this isn't done the USB frames aren't formatted correctly and, for example,
> >>> the USB3 ethernet ax88179_178a card will stop sending...
> >>
> >>
> >> Unfortunately this patch causes a regression when copying large files to my
> >> outboard USB3 drive. (Nothing at all to do with networking.)
>
> > Do you have CONFIG_USB_DEBUG turned on for 3.13? If so, you should see
> > dmesg output from this statement shortly before your drive fails:
> >
> > if (num_trbs >= TRBS_PER_SEGMENT) {
> > xhci_err(xhci, "Too many fragments %d, max %d\n",
> > num_trbs, TRBS_PER_SEGMENT - 1);
> > return -ENOMEM;
> > }
>
> Well, the answers depend on whether the usb3 drive uses logical volumes or not
> (lvm2), which I can't explain. What I've described so far is with lvm2.
>
> When using lvm2 on the usb3 drive, turning on USB_DEBUG has *no* effect -- the
> console prints two or three lines stating that the ext4 journal has quit and
> the drive is remounted ro. That particular drive stays wedged until the next
> reboot, but no other ill effects to the system.

Odd. In 3.12 xHCI has dynamic debugging, and turning on CONFIG_USB_DEBUG
should turn on debugging by default, so it's confusing that you didn't
see any messages.

Can I see your .config from /boot/? Also, did you try capturing dmesg
with `tail -f /var/log/kern.log` or just dmesg? Perhaps you need to run
`sudo dmesg -n 7`?

> OTOH, when I put a disk with just an ordinary ext4 partition in the usb3 dock,
> (no logical volumes) the copy failure becomes catastrophic, with kernel panic
> messages, leaving the system unresponsive and needing a hard reset to recover.
>
> I also tried your other suggestion:
>
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 4265b48..1a6a43d 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -4714,7 +4714,7 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
> int retval;
>
> /* Accept arbitrarily long scatter-gather lists */
> - hcd->self.sg_tablesize = ~0;
> + hcd->self.sg_tablesize = 31;
>
> /* support to build packet from discontinuous buffers */
> hcd->self.no_sg_constraint = 1;
>
> Sadly it didn't fix the problem. Did I get the patch right?

Yes, you did. So perhaps the patch triggers a different bug. I can't
tell until I see xHCI debugging output.

> Thanks for your help, and I'm happy to try more ideas, as always.

Thanks for your patience. :)

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/