Re: MT76x2U crashes XHCI driver on AMD Ryzen system

From: Stanislaw Gruszka
Date: Mon Feb 18 2019 - 09:38:31 EST


(cc: IOMMU & page_frag_alloc maintainers)

On Tue, Jan 15, 2019 at 10:04:01AM +0100, Lorenzo Bianconi wrote:
> > On Mon, Jan 14, 2019 at 1:18 AM Lorenzo Bianconi
> > <lorenzo.bianconi@xxxxxxxxxx> wrote:
> > >
> > > > On Sun, Jan 13, 2019 at 11:00 AM Lorenzo Bianconi
> > > > <lorenzo.bianconi@xxxxxxxxxx> wrote:
> > > > >
> > > > > >
> > > > > >
> > > > > > On Sun, Jan 13, 2019 at 5:33 AM, Lorenzo Bianconi <lorenzo.bianconi@xxxxxxxxxx> wrote:
> > > > > >
> > > > > > Direct. No VM used. This is the only peripheral causing this issue.
> > > > > >
> > > > > > Is the device connected to a usb3.0 port? If so, could you please try to connect the dongle to a 2.0 one?
> > > > > >
> > > > > > I tried through a USB 2.0 port. Shouldn't make a difference as they both use the xhci driver.
> > > > > >
> > > > >
> > > > > mt76x2u supports scatter-gather on usb 3.0 (not on 2.0)
> > > > Tried a USB 3 port. Same result.
> > > > >
> > > > > > Could you please double check if IOMMU is enabled?
> > > > > >
> > > > >
> > > > > Have you tried to disable it? Does it make any difference?
> > > > No idea how. UEFI doesn't seem to show anything similar.
> > > >
> > > > Similar bug report: https://bugzilla.kernel.org/show_bug.cgi?id=202241

FWIW: I provided some patches in the bugzilla, which were reported to
solve the problem. But I am looking for confirmation of whether both are
needed:

0001-mt76x02u-use-usb_bulk_msg-to-upload-firmware.patch
0002-mt76usb-do-not-use-compound-head-page-for-SG-I-O.patch

Or whether the problem can be solved by just one of them (either the
first or the second).

Additionally I'm not 100% sure if

0002-mt76usb-do-not-use-compound-head-page-for-SG-I-O.patch

is correct. So perhaps some IOMMU maintainer could look at it.

> > > You should be able to disable iommu using GRUB_CMDLINE_LINUX in
> > > /etc/default/grub (I guess setting iommu=off and reinstalling grub)
> > > https://wiki.gentoo.org/wiki/IOMMU_SWIOTLB
> > Yep. Working great now. I wonder what mt76 is doing to cause the crash though...
>
> Thanks for bisecting the issue.

Lorenzo, what do you mean by 'bisecting' here? Did someone run
'git bisect' on this issue?

> I think amd iommu does not support well usb scatter-gather
> (used by default in mt76u). I am working on a series in order to add the possibility to
> disable it.

Even if it is true that AMD IOMMU does not support SG 'well' (which I
do not think is the case), disabling SG in the mt76 driver is not the
right solution. The right solution would be to report the issue to the
AMD IOMMU maintainers (already CCed).

One problem in mt76 is page_frag_alloc() usage with different sizes.
Unlike other allocators, page_frag_alloc() does not ensure alignment
and relies on callers to provide buffer sizes that are aligned.
An unaligned buffer might then not be appropriate for DMA.
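
To illustrate the alignment concern, here is a minimal userspace sketch.
It is not kernel code: struct toy_frag_cache and the toy_* helpers are
hypothetical names, and the model only assumes that fragments are carved
from the end of a backing area downwards with no rounding applied to the
requested size.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a fragment cache (hypothetical, userspace only):
 * fragments are carved from the end of the backing area downwards
 * and the requested size is never rounded. */
struct toy_frag_cache {
	size_t size;	/* total bytes in the backing area */
	size_t offset;	/* start of the most recently carved fragment */
};

static void toy_frag_init(struct toy_frag_cache *c, size_t size)
{
	c->size = size;
	c->offset = size;
}

/* Returns the offset of the new fragment, or (size_t)-1 when the
 * remaining space is too small for the request. */
static size_t toy_frag_alloc(struct toy_frag_cache *c, size_t fragsz)
{
	if (fragsz > c->offset)
		return (size_t)-1;
	c->offset -= fragsz;
	return c->offset;
}

/* Round len up to a multiple of align (align must be a power of two). */
static size_t toy_align_up(size_t len, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (len + align - 1) & ~(align - 1);
}
```

In this model a single odd-sized request shifts every later fragment:
after requests of 1024 and 1022 bytes from a 4096-byte area, the next
fragment starts at offset 1026. Rounding each size with toy_align_up()
before allocating keeps all fragment offsets aligned.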

Another issue is that dma_map_sg() & dma_map_page() may impose some
constraints. I'm not sure about that and I want to clarify it with the
CCed mm maintainers. I think DMA drivers may expect sg->offset < PAGE_SIZE
for both dma_map_sg() and dma_map_page(). Additionally, dma_map_page()
might expect that offset & length specify a buffer within one page.
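
The suspected constraints can be written down as a small userspace
sketch. This only illustrates the two conditions mentioned above and is
not a statement of what the DMA API actually requires. TOY_PAGE_SIZE,
struct toy_sg_ref and the toy_* helpers are hypothetical names, and a
4096-byte page is assumed.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/* A (page, offset) pair where offset is meant to stay below the page size. */
struct toy_sg_ref {
	size_t page_index;	/* index of the page relative to the head page */
	size_t offset;		/* byte offset inside that page */
};

/* Convert a byte offset measured from the head of a multi-page area
 * into a per-page reference with offset < TOY_PAGE_SIZE. */
static struct toy_sg_ref toy_normalize(size_t off_from_head)
{
	struct toy_sg_ref r = {
		.page_index = off_from_head / TOY_PAGE_SIZE,
		.offset = off_from_head % TOY_PAGE_SIZE,
	};
	return r;
}

/* True when [offset, offset + len) lies entirely inside one page. */
static bool toy_fits_in_one_page(size_t offset, size_t len)
{
	return offset < TOY_PAGE_SIZE && len <= TOY_PAGE_SIZE - offset;
}
```

For example, a buffer that starts 12800 bytes past the head page is
described here as page 3 with offset 512, rather than page 0 with
offset 12800.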

Stanislaw