Re: [PATCH] iommu/dma: Add support for DMA_ATTR_SYS_CACHE

From: Jordan Crouse
Date: Mon Oct 28 2019 - 11:34:52 EST


On Mon, Oct 28, 2019 at 11:59:04AM +0000, Robin Murphy wrote:
> On 28/10/2019 11:24, Will Deacon wrote:
> >Hi Christoph,
> >
> >On Mon, Oct 28, 2019 at 08:41:56AM +0100, Christoph Hellwig wrote:
> >>On Sat, Oct 26, 2019 at 03:12:57AM -0700, isaacm@xxxxxxxxxxxxxx wrote:
> >>>On 2019-10-25 22:30, Christoph Hellwig wrote:
> >>>>The definition makes very little sense.
> >>>Can you please clarify what part doesn't make sense, and why?
> >>
> >>It looks like complete garbage to me. That might just be because it
> >>uses tons of terms I've never heard of and which aren't used anywhere
> >>in the DMA API. It also might be because it doesn't explain how the
> >>flag might actually be practically useful.
> >
> >Agreed. The way I /think/ it works is that on many SoCs there is a
> >system/last-level cache (LLC) which effectively sits in front of memory for
> >all masters. Even if a device isn't coherent with the CPU caches, we still
> >want to be able to allocate into the LLC. Why this doesn't happen
> >automatically is beyond me, but it appears that on these Qualcomm designs
> >you actually have to set the memory attributes up in the page-table to
> >ensure that the resulting memory transactions are non-cacheable for the CPU
> >but cacheable for the LLC. Without any changes, the transactions are
> >non-cacheable in both of them which assumedly has a performance cost.
> >
> >But you can see that I'm piecing things together myself here. Isaac?
>
> FWIW, that's pretty much how Pratik and Jordan explained it to me - the LLC
> sits directly in front of memory and is more or less transparent, although
> it might treat CPU and device accesses slightly differently (I don't
> remember exactly how the inner cacheability attribute interacts). Certain
> devices don't get much benefit from the LLC, hence the desire for
> finer-grained control of their outer allocation policy to avoid more
> thrashing than necessary. Furthermore, for stuff in the video/GPU/display
> area certain jobs benefit more than others, hence the desire to go even
> finer-grained than a per-device control in order to maximise LLC
> effectiveness.

Robin's description is correct. And we did have a patch for an in-kernel user
but it got lost in the wash. I'm hoping Sharat can get a respin in time for 5.5.

https://lore.kernel.org/linux-arm-msm/1538744915-25490-8-git-send-email-smasetty@xxxxxxxxxxxxxx/

Jordan

--
The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
a Linux Foundation Collaborative Project
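
For reference, the "non-cacheable for the CPU but cacheable for the LLC"
attributes discussed above can be illustrated with the ARMv8 MAIR attribute
byte, where bits [7:4] carry the outer policy and bits [3:0] the inner policy.
This is only a sketch: the helper and enum names are hypothetical, and it
assumes that the LLC follows the outer attribute while the CPU caches follow
the inner one. Whether the patch under discussion uses exactly this encoding
is not stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * ARMv8 MAIR_ELx attribute byte for Normal memory:
 *   bits [7:4] = outer cacheability, bits [3:0] = inner cacheability.
 * The names below are hypothetical and exist only for this illustration.
 */
enum {
	MAIR_NIBBLE_NC     = 0x4, /* Normal, Non-cacheable */
	MAIR_NIBBLE_WB_RWA = 0xf, /* Write-Back, non-transient, read/write-allocate */
};

/* Combine an outer and an inner policy nibble into one attribute byte. */
static inline uint8_t mair_attr(uint8_t outer, uint8_t inner)
{
	return (uint8_t)(((outer & 0xf) << 4) | (inner & 0xf));
}

/* Non-cacheable everywhere: the behaviour described as "without any changes". */
static inline uint8_t mair_noncacheable(void)
{
	return mair_attr(MAIR_NIBBLE_NC, MAIR_NIBBLE_NC);
}

/*
 * Inner Non-cacheable, outer Write-Back: bypasses the CPU caches but may
 * still allocate in an outer cache (assumed here to be the system cache/LLC).
 */
static inline uint8_t mair_llc_only(void)
{
	return mair_attr(MAIR_NIBBLE_WB_RWA, MAIR_NIBBLE_NC);
}

/* Small self-check of the encodings above. */
static inline void mair_self_check(void)
{
	assert(mair_noncacheable() == 0x44);
	assert(mair_llc_only() == 0xf4);
}
```

A per-buffer flag such as the proposed DMA_ATTR_SYS_CACHE would then amount to
choosing between attribute bytes like these when the IOMMU mapping is created,
which is what would allow the finer-than-per-device control Robin mentions.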