Re: [PATCH] Revert "arm64: dma: Drop cache invalidation from arch_dma_prep_coherent()"
From: Catalin Marinas
Date: Thu Nov 24 2022 - 06:55:14 EST
On Mon, Nov 21, 2022 at 03:42:27PM +0530, Sibi Sankar wrote:
> On 11/21/22 12:12, Manivannan Sadhasivam wrote:
> > On Fri, Nov 18, 2022 at 12:33:49PM +0000, Will Deacon wrote:
> > > On Fri, Nov 18, 2022 at 04:24:02PM +0530, Manivannan Sadhasivam wrote:
> > > > On Mon, Nov 14, 2022 at 05:38:00PM +0000, Catalin Marinas wrote:
> > > > > On Mon, Nov 14, 2022 at 03:14:21PM +0000, Robin Murphy wrote:
> > > > > > Clearly that driver is completely broken though. If the DMA allocation came
> > > > > > from a no-map carveout vma_dma_alloc_from_dev_coherent() then the vmap()
> > > > > > shenanigans wouldn't work, so if it backed by struct pages then the whole
> > > > > > dance is still pointless because *a cacheable linear mapping exists*, and
> > > > > > it's just relying on the reduced chance that anything's going to re-fetch
> > > > > > the linear map address after those pages have been allocated, exactly as I
> > > > > > called out previously[1].
> > > > >
> > > > > So I guess a DMA pool that's not mapped in the linear map, together with
> > > > > memremap() instead of vmap(), would work around the issue. But the
> > > > > driver needs fixing, not the arch code.
> > > >
> > > > Okay, thanks for the hint. Can you share how to allocate the dma-pool that's
> > > > not part of the kernel's linear map? I looked into it but couldn't find a way.
> > >
> > > The no-map property should take care of this iirc
> >
> > Yeah, we have been using it in other places of the same driver. But as per
> > Sibi, we used dynamic allocation for metadata validation since there was no
> > memory reserved statically for that.
>
> Unlike the other portions in the driver that required statically defined
> no-map carveouts, metadata just needed a contiguous memory for
> authentication. Re-using existing carveouts for this metadata region
> may not work due to modem FW limitations and declaring a new carveout for
> metadata will break the device tree bindings. That's the reason for
> using DMA_ATTR_NO_KERNEL_MAPPING for dma_alloc_attr and vmpa/vunmap with
> VM_FLUSH_RESET_PERMS before passing the memory onto modem. Are there other
> suggestions for achieving the same without breaking bindings?
Your DMA_ATTR_NO_KERNEL_MAPPING workaround doesn't work, it only makes
the failure rate smaller. All this attribute does is avoiding creating a
non-cacheable mapping but you still have the kernel linear mapping in
place that may be speculatively accessed by the CPU. You were just lucky
so far not to have hit the issue. So I'd rather see this fixed properly
with a no-map carveout. Maybe you can reuse an existing carveout if the
driver already needs some and avoid changing the DT. More complicated
options include allocating memory and unmapping it from the linear map
with set_memory_valid(), though that's not exported to modules and it
also requires the linear map to be pages only, not block mappings.
Yet another option is to have the swiotlb buffer unmapped from the
kernel linear map and use the bounce buffer for this. That's more
involved (Robin has some patches, though for a different reason and they
may not make it upstream).
--
Catalin