Re: [PATCH v2] dma-direct: improve DMA_ATTR_NO_KERNEL_MAPPING
From: Ard Biesheuvel
Date: Thu Nov 04 2021 - 12:22:42 EST
On Thu, 4 Nov 2021 at 14:40, Walter Wu <walter-zh.wu@xxxxxxxxxxxx> wrote:
>
> On Thu, 2021-11-04 at 13:47 +0100, Ard Biesheuvel wrote:
> > On Thu, 4 Nov 2021 at 13:31, Walter Wu <walter-zh.wu@xxxxxxxxxxxx>
> > wrote:
> > >
> > > On Thu, 2021-11-04 at 09:57 +0100, Ard Biesheuvel wrote:
> > > > On Thu, 4 Nov 2021 at 09:53, Christoph Hellwig <hch@xxxxxx>
> > > > wrote:
> > > > >
> > > > > On Thu, Nov 04, 2021 at 10:32:21AM +0800, Walter Wu wrote:
> > > > > > > diff --git a/include/linux/set_memory.h b/include/linux/set_memory.h
> > > > > > index f36be5166c19..6c7d1683339c 100644
> > > > > > --- a/include/linux/set_memory.h
> > > > > > +++ b/include/linux/set_memory.h
> > > > > > @@ -7,11 +7,16 @@
> > > > > >
> > > > > > #ifdef CONFIG_ARCH_HAS_SET_MEMORY
> > > > > > #include <asm/set_memory.h>
> > > > > > +
> > > > > > +#ifndef CONFIG_RODATA_FULL_DEFAULT_ENABLED
> > > > >
> > > > > > This is an arm64-specific symbol, and one that only controls a
> > > > > > default. I don't think it is suitable to key off stubs in common
> > > > > > code.
> > > > >
> > > > > > > +static inline int set_memory_valid(unsigned long addr, int numpages, int enable) { return 0; }
> > > > >
> > > > > > Please avoid overly long lines.
> > > > >
> > > > > > + if
> > > > > > (IS_ENABLED(CONFIG_RODATA_FULL_DEFAULT_ENABLED))
> > > > > > {
> > > > > > + kaddr = (unsigned
> > > > > > long)phys_to_virt(dma_to_phys(dev, *dma_handle));
> > > > >
> > > > > This can just use page_address.
> > > > >
> > > > > > + /* page remove kernel mapping for arm64
> > > > > > */
> > > > > > + set_memory_valid(kaddr, size >>
> > > > > > PAGE_SHIFT,
> > > > > > 0);
> > > > > > + }
> > > > >
> > > > > > But more importantly: set_memory_valid only exists on arm64, so
> > > > > > this will break compilation everywhere else. And this API is
> > > > > > complete crap. Passing kernel virtual addresses as unsigned long
> > > > > > just sucks, and passing an integer argument for valid/non-valid
> > > > > > is also a horrible API.
> > > > >
> > > >
> > > > > ... and as I pointed out before, you can still pass rodata=off on
> > > > > arm64, and get the old behavior, in which case bad things will happen
> > > > > if you try to use an API that expects to operate on page mappings with
> > > > > a 1 GB block mapping.
> > > >
> > >
> > > Thanks for your suggestion.
> > >
> > >
> > > > > And you still haven't explained what the actual problem is: is this
> > > > > about CPU speculation corrupting non-cache coherent inbound DMA?
> > >
> > > > No corruption; the CPU only reads it. We hope to fix that behavior.
> > >
> >
> > Fix which behavior? Please explain
> >
> > 1) the current behavior
> We call dma_direct_alloc() with DMA_ATTR_NO_KERNEL_MAPPING to get the
> allocated buffer, and the kernel mapping still exists. Our goal is that
> this buffer cannot be accessed by the CPU. Unfortunately, we see the CPU
> reading it speculatively. So we need to fix that, and we do not want to
> use the no-map approach.
>
> > 2) why the current behavior is problematic for you
> A buffer from dma_direct_alloc() with DMA_ATTR_NO_KERNEL_MAPPING still
> has a kernel mapping, so the CPU can still read it speculatively.
> Although we have hardware that protects the buffer, we would still like
> to fix this in software.
>
But *why* is this a problem? You are saying that the speculative
accesses are not causing corruption, so they are causing other issues
that you want to address. So which issues are we talking about here?
> > 3) how this patch changes the current behavior
> When dma_direct_alloc() is called with DMA_ATTR_NO_KERNEL_MAPPING, the
> patch removes the kernel mapping that belongs to the buffer.
>
> > 4) why the new behavior fixes your problem.
> If I understand correctly, to block CPU speculation we need to unmap
> the buffer in both the stage 1 and stage 2 page tables and invalidate
> the TLB. This patch does the stage 1 unmap at EL1.
>
> >
> > There is no penalty for using too many words.
>
> Thanks.
> Walter
>
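
To make the rodata=off concern above concrete, here is a minimal userspace
sketch in C. It is not kernel code and not the arm64 implementation: every
name in it (model_linear_map, model_set_pages_valid, and so on) is
hypothetical, and the block size is a toy value. It only models the idea
that a per-page valid bit can be cleared when the linear map uses page
granularity, while a range covered by a single block entry has no per-page
bit to clear. The model returns an error in that case purely to make the
hazard visible; what a real kernel does there is outside this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy sizes (assumption): a real 1 GB block covers 262144 4 KB pages. */
#define MODEL_PAGES_PER_BLOCK 8
#define MODEL_NR_BLOCKS       4

enum model_granule { MODEL_GRANULE_BLOCK, MODEL_GRANULE_PAGE };

/* One top-level entry: either a single block mapping or a table of pages. */
struct model_block {
	enum model_granule granule;
	bool block_valid;                        /* used for block granularity */
	bool page_valid[MODEL_PAGES_PER_BLOCK];  /* used for page granularity */
};

struct model_linear_map {
	struct model_block blocks[MODEL_NR_BLOCKS];
};

/* Map everything as valid, using the given granularity throughout. */
static void model_map_init(struct model_linear_map *map,
			   enum model_granule granule)
{
	for (size_t b = 0; b < MODEL_NR_BLOCKS; b++) {
		map->blocks[b].granule = granule;
		map->blocks[b].block_valid = true;
		for (size_t p = 0; p < MODEL_PAGES_PER_BLOCK; p++)
			map->blocks[b].page_valid[p] = true;
	}
}

/*
 * Set the valid bit for npages pages starting at first_page.
 * Returns 0 on success, -1 if the range is out of bounds or touches a
 * block mapping. Nothing is modified when -1 is returned.
 */
static int model_set_pages_valid(struct model_linear_map *map,
				 size_t first_page, size_t npages, bool valid)
{
	const size_t total = (size_t)MODEL_NR_BLOCKS * MODEL_PAGES_PER_BLOCK;

	assert(map != NULL);
	if (npages == 0 || first_page >= total || npages > total - first_page)
		return -1;

	/* First pass: refuse before changing anything. */
	for (size_t pg = first_page; pg < first_page + npages; pg++) {
		if (map->blocks[pg / MODEL_PAGES_PER_BLOCK].granule !=
		    MODEL_GRANULE_PAGE)
			return -1;
	}

	/* Second pass: update the per-page valid bits. */
	for (size_t pg = first_page; pg < first_page + npages; pg++) {
		map->blocks[pg / MODEL_PAGES_PER_BLOCK]
			.page_valid[pg % MODEL_PAGES_PER_BLOCK] = valid;
	}
	return 0;
}

/* Report whether a page is currently reachable through the modeled map. */
static bool model_page_is_mapped(const struct model_linear_map *map,
				 size_t page)
{
	const struct model_block *blk =
		&map->blocks[page / MODEL_PAGES_PER_BLOCK];

	if (blk->granule == MODEL_GRANULE_BLOCK)
		return blk->block_valid;
	return blk->page_valid[page % MODEL_PAGES_PER_BLOCK];
}
```

With the map initialised as MODEL_GRANULE_BLOCK, a call such as
model_set_pages_valid(&map, 3, 2, false) fails and the pages stay mapped;
with MODEL_GRANULE_PAGE the same call succeeds and only pages 3 and 4 become
unmapped. That is the sense in which a stub keyed off a config default cannot
tell whether per-page invalidation is possible at run time.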