RE: [RFC 10/20] iommu/iommufd: Add IOMMU_DEVICE_GET_INFO

From: Tian, Kevin
Date: Wed Sep 29 2021 - 04:48:44 EST


+Robin.

> From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Sent: Thursday, September 23, 2021 8:22 PM
>
> On Thu, Sep 23, 2021 at 12:05:29PM +0000, Tian, Kevin wrote:
> > > From: Jason Gunthorpe <jgg@xxxxxxxxxx>
> > > Sent: Thursday, September 23, 2021 7:27 PM
> > >
> > > On Thu, Sep 23, 2021 at 11:15:24AM +0100, Jean-Philippe Brucker wrote:
> > >
> > > > So we can only tell userspace "No_snoop is not supported" (provided we
> > > > even want to allow them to enable No_snoop). Users in control of stage-1
> > > > tables can create non-cacheable mappings through MAIR attributes.
> > >
> > > My point is that ARM is using IOMMU_CACHE to control the overall
> > > cachability of the DMA
> > >
> > > ie not specifying IOMMU_CACHE requires using the arch specific DMA
> > > cache flushers.
> > >
> > > Intel never uses arch specifc DMA cache flushers, and instead is
> > > abusing IOMMU_CACHE to mean IOMMU_BLOCK_NO_SNOOP on DMA that
> > > is always cachable.
> >
> > it uses IOMMU_CACHE to force all DMAs to snoop, including those which
> > has non_snoop flag and wouldn't snoop cache if iommu is disabled. Nothing
> > is blocked.
>
> I see it differently, on Intel the only way to bypass the cache with
> DMA is to specify the no-snoop bit in the TLP. The IOMMU PTE flag we
> are talking about tells the IOMMU to ignore the no snoop bit.
>
> Again, Intel arch in the kernel does not support the DMA cache flush
> arch API and *DOES NOT* support incoherent DMA at all.
>
> ARM *does* implement the DMA cache flush arch API and is using
> IOMMU_CACHE to control if the caller will, or will not call the cache
> flushes.

I still don't fully understand this point after reading the code. In
dma-iommu, the cache flush functions all start with the check below:

	if (dev_is_dma_coherent(dev) && !dev_is_untrusted(dev))
		return;

dev->dma_coherent is initialized from firmware info; it is not decided
by IOMMU_CACHE.

i.e. IOMMU_CACHE is not what decides whether the cache flushes are
called.
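
To make the point concrete, here is a minimal userspace sketch of that
early return. It is only a model: struct model_dev, MODEL_IOMMU_CACHE,
flush_count and the model_* functions are hypothetical stand-ins that I
made up for illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct device; not the kernel definition. */
struct model_dev {
	bool dma_coherent;	/* in this model: taken from firmware info */
	bool untrusted;
};

/* Illustrative flag bit standing in for IOMMU_CACHE. */
#define MODEL_IOMMU_CACHE (1 << 2)

/* Counts simulated arch cache maintenance calls. */
int flush_count;

void model_arch_sync(void)
{
	flush_count++;
}

/*
 * Same shape as the quoted check: only device coherency and trust
 * decide whether cache maintenance happens. ioprot is accepted but
 * never consulted, which is the point being made.
 */
void model_sync_for_device(const struct model_dev *dev, int ioprot)
{
	(void)ioprot;

	if (dev->dma_coherent && !dev->untrusted)
		return;

	model_arch_sync();
}
```

In this model a coherent, trusted device never reaches
model_arch_sync() whatever ioprot says, and a non-coherent device always
does, even when MODEL_IOMMU_CACHE is set.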

Probably the confusion comes from __iommu_dma_alloc_noncontiguous:

	if (!(ioprot & IOMMU_CACHE)) {
		struct scatterlist *sg;
		int i;

		for_each_sg(sgt->sgl, sg, sgt->orig_nents, i)
			arch_dma_prep_coherent(sg_page(sg), sg->length);
	}

Here it would make more sense for the check to be if (!coherent) {}.
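
A rough sketch of the difference between the two forms of the check,
again as a userspace model. All sketch_* names and SKETCH_IOMMU_CACHE
are hypothetical, and sketch_info_to_prot() encodes my assumption that
the prot is derived from the device's coherency on this path:

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bit standing in for IOMMU_CACHE. */
#define SKETCH_IOMMU_CACHE (1 << 2)

/* Assumption: the IOMMU prot is derived from device coherency. */
int sketch_info_to_prot(bool coherent)
{
	return coherent ? SKETCH_IOMMU_CACHE : 0;
}

/* The check as currently written: keyed on the IOMMU prot flag. */
bool sketch_prep_by_ioprot(int ioprot)
{
	return !(ioprot & SKETCH_IOMMU_CACHE);
}

/* The check as suggested: keyed on device coherency directly. */
bool sketch_prep_by_coherent(bool coherent)
{
	return !coherent;
}
```

Under that assumption both predicates give the same answer, so the
change would only make the intent explicit. They diverge only if a
caller passes a prot that disagrees with the device's coherency.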

With the above corrected, I think all iommu drivers do associate
IOMMU_CACHE with the snoop aspect:

Intel:
- either forces snooping by ignoring the no-snoop bit in the TLP (IOMMU_CACHE)
- or lets the TLP decide whether to snoop (!IOMMU_CACHE)

ARM:
- set to snoop format if IOMMU_CACHE
- set to nonsnoop format if !IOMMU_CACHE
(in both cases the TLP snoop bit is ignored?)

Other archs:
- ignore IOMMU_CACHE as the cache is always snooped via their IOMMUs
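
The summary above can be written down as a small truth-table model.
This is only my reading, not driver code: enum model_arch and
model_dma_snoops() are hypothetical names, and the ARM case assumes the
TLP bit is ignored, which is exactly the open question above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three groups in the summary. */
enum model_arch { MODEL_INTEL, MODEL_ARM, MODEL_OTHER };

/*
 * Returns true if, in this model, a DMA access snoops the CPU cache.
 * iommu_cache:  mapping was created with IOMMU_CACHE
 * tlp_no_snoop: the device set the no-snoop attribute in the TLP
 */
bool model_dma_snoops(enum model_arch arch, bool iommu_cache,
		      bool tlp_no_snoop)
{
	switch (arch) {
	case MODEL_INTEL:
		/* IOMMU_CACHE forces snoop; otherwise the TLP decides. */
		return iommu_cache ? true : !tlp_no_snoop;
	case MODEL_ARM:
		/* Assumption: the TLP bit is ignored either way. */
		return iommu_cache;
	case MODEL_OTHER:
	default:
		/* IOMMU_CACHE is ignored; the cache is always snooped. */
		return true;
	}
}
```

In all three cases the flag only affects whether the access snoops,
which is why I read IOMMU_CACHE as a snoop control rather than a
cache-flush control.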

>
> This is fundamentally different from what Intel is using it for.
>
> > but why do you call it abuse? IOMMU_CACHE was first introduced for
> > Intel platform:
>
> IMHO ARM changed the meaning when Robin linked IOMMU_CACHE to
> dma_is_coherent stuff. At that point it became linked to 'do I need to
> call arch cache flushers or not'.
>

I couldn't identify the exact commit that made the above change in meaning.

Robin, could you help share some thoughts here?

Thanks
Kevin