Re: Bug in add_dma_entry()'s debugging code

From: Alan Stern
Date: Mon Nov 27 2023 - 11:51:27 EST

Next message: Lizhi Hou: "Re: [PATCH v3 5/5] dmaengine: xilinx: xdma: Ease dma_pool alignment requirements"
Previous message: callahanbryson63: "Building Estimates"
In reply to: Christoph Hellwig: "Re: Bug in add_dma_entry()'s debugging code"
Next in thread: Christoph Hellwig: "Re: Bug in add_dma_entry()'s debugging code"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, Nov 27, 2023 at 05:07:59PM +0100, Christoph Hellwig wrote:
> On Mon, Nov 27, 2023 at 11:02:20AM -0500, Alan Stern wrote:
> > All it looks for is mappings that start on the same cache line. However
> > on architectures that have cache-coherent DMA (such as x86), touching
> > the same cache line does not mean that two DMA mappings will interfere
> > with each other. To truly overlap, they would have to touch the same
> > _bytes_.
>
> But that is a special case that does not matter. Linux drivers need
> to be written in a portable way, and that means we have to care about
> platforms that are not DMA coherent.

The buffers in the bug report were allocated by kmalloc(). Doesn't
kmalloc() guarantee that on architectures with non-cache-coherent DMA,
allocations will be aligned on cache-line boundaries (or larger)? Isn't
this what ARCH_DMA_MINALIGN and ARCH_KMALLOC_MINALIGN are supposed to
take care of in include/linux/slab.h?

> > How should this be fixed? Since the check done in add_dma_entry() is
> > completely invalid for x86 and similar architectures, should it simply
> > be removed for them? Or should the check be enhanced to look for
> > byte-granularity overlap?
>
> The patch is every but "completely invalid". It points out that you
> violate the Linux DMA API requirements.

Since when does the DMA API require that mappings on x86 must be to
separate cache lines? Is this documented anywhere?

For that matter, Documentation/core-api/dma-api-howto.rst explicitly
says:

If you acquired your memory via the page allocator
(i.e. __get_free_page*()) or the generic memory allocators
(i.e. kmalloc() or kmem_cache_alloc()) then you may DMA to/from
that memory using the addresses returned from those routines.

It also says:

Architectures must ensure that kmalloc'ed buffer is
DMA-safe. Drivers and subsystems depend on it. If an architecture
isn't fully DMA-coherent (i.e. hardware doesn't ensure that data in
the CPU cache is identical to data in main memory),
ARCH_DMA_MINALIGN must be set so that the memory allocator
makes sure that kmalloc'ed buffer doesn't share a cache line with
the others. See arch/arm/include/asm/cache.h as an example.

It says nothing about avoiding more than one DMA operation at a time to
prevent overlap. Is the documentation wrong?

> This might not have an
> effect on the particular plaform you are currently running on, but it
> is still wrong.

Who decides what is right and what is wrong in this area? Clearly you
have a different opinion from David S. Miller, Richard Henderson, and
Jakub Jelinek (the authors of that documentation file).

> Note that there have been various mumblings about
> using nosnoop dma on x86, in which case you'll have the issue there
> as well.

Unless the people who implement nosnoop DMA also modify kmalloc() or
ARCH_DMA_MINALIGN.

I guess the real question boils down to where the responsiblity should
lie. Should kmalloc() guarantee that the memory regions it provides
will always be suitable for DMA, with no overlap issues? Or should all
the innumerable users of kmalloc() be responsible for jumping through
hoops to request arch-specific minimum alignment for their DMA buffers?

Alan Stern

Next message: Lizhi Hou: "Re: [PATCH v3 5/5] dmaengine: xilinx: xdma: Ease dma_pool alignment requirements"
Previous message: callahanbryson63: "Building Estimates"
In reply to: Christoph Hellwig: "Re: Bug in add_dma_entry()'s debugging code"
Next in thread: Christoph Hellwig: "Re: Bug in add_dma_entry()'s debugging code"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]