On Mon, Aug 9, 2021 at 8:20 AM Xianting TIan
<xianting.tian@xxxxxxxxxxxxxxxxx> wrote:
> > > +#define ARCH_DMA_MINALIGN L1_CACHE_BYTES
> >
> > It's not a good idea to blindly set this for all riscv. For "coherent"
> > platforms, this is not necessary and will waste memory.
>
> I checked ARCH_DMA_MINALIGN definition, "If an architecture isn't fully
> DMA-coherent, ARCH_DMA_MINALIGN must be set".
> so that the memory allocator makes sure that kmalloc'ed buffer doesn't
> share a cache line with the others.
>
> Documentation/core-api/dma-api-howto.rst
>
> 2) ARCH_DMA_MINALIGN
> Architectures must ensure that kmalloc'ed buffer is
> DMA-safe. Drivers and subsystems depend on it. If an architecture
> isn't fully DMA-coherent (i.e. hardware doesn't ensure that data in
> the CPU cache is identical to data in main memory),
> ARCH_DMA_MINALIGN must be set so that the memory allocator
> makes sure that kmalloc'ed buffer doesn't share a cache line with
> the others. See arch/arm/include/asm/cache.h as an example.
> Note that ARCH_DMA_MINALIGN is about DMA memory alignment
> constraints. You don't need to worry about the architecture data
> alignment constraints (e.g. the alignment constraints about 64-bit
> objects).
>
> Arnd, thanks for info, according to the description, seems we need to
> apply this patch to riscv.

The platform spec [1] says about this:

| Memory accesses by I/O masters can be coherent or non-coherent
| with respect to all hart-related caches.

So the kernel in its default configuration can not assume that DMA is
cache coherent on RISC-V. Making this configurable implies that
a kernel that is configured for cache-coherent machines can no longer
run on all hardware that follows the platform spec.
We have the same problem on arm64, where most of the server parts
are cache coherent, but the majority of the low-end embedded devices
are not, and we require that a single kernel can run on all of the above.
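
To put a rough number on the "waste memory" concern, here is a minimal
user-space sketch of what a minimum allocation alignment does to small
requests. The helper names are hypothetical and are not kernel APIs, and
the 64-byte cache line used in the note below is only an assumed value;
real line sizes vary by platform.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not a kernel API: round a request up to the
 * minimum alignment the allocator has to honour.
 * 'align' must be a non-zero power of two.
 */
static size_t padded_size(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}

/* Bytes lost to padding for a single allocation of 'size' bytes. */
static size_t wasted_bytes(size_t size, size_t align)
{
    return padded_size(size, align) - size;
}
```

With an assumed 64-byte line, padded_size(8, 64) is 64 and
wasted_bytes(8, 64) is 56, so every 8-byte allocation occupies a full
line. With an 8-byte minimum the same request wastes nothing, which is
the saving a coherent-only configuration would be after.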
One idea that we have discussed several times is to start the kernel
without the small kmalloc caches and defer their creation until a
later point in the boot process after determining whether any
non-coherent devices have been discovered. Any in-kernel structures
that have an explicit ARCH_DMA_MINALIGN alignment won't
benefit from this, but any subsequent kmalloc() calls can use the
smaller caches. The tricky bit is finding out whether /everything/ on
the system is cache-coherent or not, since we do not have a global
flag for that in the DT. See [2] for a recent discussion.
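
The deferred decision could be sketched roughly as follows. This is
user-space C under loud assumptions: struct device_info, the two size
constants and both function names are made up for illustration, and it
pretends the full device list is already known, which is exactly the
part described above as tricky.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SMALLEST_KMALLOC_SIZE 8u   /* assumed smallest kmalloc cache */
#define ASSUMED_CACHE_LINE    64u  /* assumed cache line size */

/* Hypothetical record of a discovered device, not a kernel structure. */
struct device_info {
    const char *name;
    bool dma_coherent;
};

/* True only if every discovered device is cache-coherent. */
static bool all_devices_coherent(const struct device_info *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!devs[i].dma_coherent)
            return false;
    }
    return true;
}

/*
 * Pick the smallest allocation size to create once discovery is done:
 * small caches are safe only when nothing non-coherent was found.
 */
static size_t min_kmalloc_size(const struct device_info *devs, size_t n)
{
    return all_devices_coherent(devs, n) ? SMALLEST_KMALLOC_SIZE
                                         : ASSUMED_CACHE_LINE;
}
```

A single non-coherent entry in the list forces the larger minimum for
the whole system. The sketch also ignores devices that show up after the
decision has been made, which a real implementation would have to deal
with.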
Arnd
[1] https://github.com/riscv/riscv-platform-specs/blob/main/riscv-platform-spec.adoc#architecture
[2] https://lore.kernel.org/linux-arm-kernel/20210527124356.22367-1-will@xxxxxxxxxx/