Re: [GIT PULL] dma-mapping fix for Linux 5.14

From: Stefano Stabellini
Date: Mon Jul 26 2021 - 16:03:58 EST


On Mon, 26 Jul 2021, Boris Ostrovsky wrote:
> On 7/25/21 12:50 PM, Linus Torvalds wrote:
> > On Sat, Jul 24, 2021 at 11:03 PM Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> >
> >> - handle vmalloc addresses in dma_common_{mmap,get_sgtable}
> >> (Roman Skakun)
> > I've pulled this, but my reaction is that we've tried to avoid this in
> > the past. Why is Xen using vmalloc'ed addresses and passing those in
> > to the dma mapping routines?
> >
> > It *smells* to me like a Xen-swiotlb bug, and it would have been
> > better to try to fix it there. Was that just too painful?
>
>
> Stefano will probably know better but this appears to have something to do with how Pi (and possibly more ARM systems?) manage DMA memory: https://lore.kernel.org/xen-devel/CADz_WD5Ln7Pe1WAFp73d2Mz9wxspzTE3WgAJusp5S8LX4=83Bw@xxxxxxxxxxxxxx/.

The original issue was found on the Raspberry Pi 4, and the fix was in
swiotlb-xen.c, commit 8b1e868f6. More recently, Roman realized that
dma_common_mmap might also end up calling virt_to_page on a vmalloc
address. This is the fix for that.
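
To illustrate why virt_to_page on a vmalloc address goes wrong, here is
a small userspace model. It is a sketch under stated assumptions, not
the actual patch and not kernel code: the address ranges, the lookup
table, and every model_* name are hypothetical. The real kernel helpers
it mimics are is_vmalloc_addr, vmalloc_to_page and virt_to_page.

```c
/*
 * Userspace model only. The address layout, the table contents and all
 * model_* / MODEL_* names are made up for illustration.
 */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT    12
#define MODEL_LINEAR_BASE   0x80000000ULL /* start of modelled linear map */
#define MODEL_VMALLOC_BASE  0xf0000000ULL /* start of modelled vmalloc area */
#define MODEL_VMALLOC_PAGES 4

/* Stand-in for the page tables: vmalloc page index -> backing pfn. */
static const uint64_t model_vmap_table[MODEL_VMALLOC_PAGES] = { 7, 3, 42, 5 };

/* Mimics is_vmalloc_addr(): is the address inside the vmalloc area? */
static bool model_is_vmalloc_addr(uint64_t addr)
{
	return addr >= MODEL_VMALLOC_BASE &&
	       addr < MODEL_VMALLOC_BASE +
		      ((uint64_t)MODEL_VMALLOC_PAGES << MODEL_PAGE_SHIFT);
}

/* Mimics virt_to_page(): pure arithmetic, valid for linear addresses only. */
static uint64_t model_virt_to_pfn(uint64_t addr)
{
	return (addr - MODEL_LINEAR_BASE) >> MODEL_PAGE_SHIFT;
}

/* Mimics vmalloc_to_page(): needs a lookup, arithmetic is not enough. */
static uint64_t model_vmalloc_to_pfn(uint64_t addr)
{
	uint64_t idx = (addr - MODEL_VMALLOC_BASE) >> MODEL_PAGE_SHIFT;

	assert(idx < MODEL_VMALLOC_PAGES);
	return model_vmap_table[idx];
}

/* Pick the translation that matches the kind of address we were given. */
static uint64_t model_addr_to_pfn(uint64_t addr)
{
	if (model_is_vmalloc_addr(addr))
		return model_vmalloc_to_pfn(addr);
	return model_virt_to_pfn(addr);
}
```

In this model, feeding a vmalloc address to model_virt_to_pfn returns a
frame number unrelated to the page that actually backs it, which is the
class of bug being discussed.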


As for why Xen is using vmalloc'ed addresses with the dma routines at
all:

Xen is actually just calling the regular dma_direct_alloc to allocate
pages (xen_swiotlb_alloc_coherent -> xen_alloc_coherent_pages ->
dma_direct_alloc). dma_direct_alloc is the generic implementation. Back
when the original issue was found, dma_direct_alloc returned a vmalloc
address on RPi4.

The original analysis was "xen_alloc_coherent_pages() eventually calls
arch_dma_alloc() in remap.c which successfully allocates pages from
atomic pool." See https://marc.info/?l=xen-devel&m=158878173207775.


I don't know on which platform Roman Skakun (CC'ed) found the problem.
But if we look at arch/arm/mm/dma-mapping.c:__dma_alloc, one of the
possible options is the "remap_allocator", which calls
__alloc_remap_buffer, which calls dma_common_contiguous_remap, which
calls vmap.

So unfortunately it seems that on certain architectures and platforms
dma_alloc_coherent can return a vmap'ed address. I would therefore
expect that this issue could also happen natively (without Xen), at
least in theory.