Re: [PATCH V10 08/11] iommu/vt-d: Add svm/sva invalidate function
From: Jacob Pan
Date: Tue Mar 31 2020 - 16:52:22 EST
On Tue, 31 Mar 2020 02:49:21 +0000
"Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
> > From: Auger Eric <eric.auger@xxxxxxxxxx>
> > Sent: Sunday, March 29, 2020 11:34 PM
> >
> > Hi,
> >
> > On 3/28/20 11:01 AM, Tian, Kevin wrote:
> > >> From: Jacob Pan <jacob.jun.pan@xxxxxxxxxxxxxxx>
> > >> Sent: Saturday, March 21, 2020 7:28 AM
> > >>
> > >> When Shared Virtual Address (SVA) is enabled for a guest OS via
> > >> vIOMMU, we need to provide invalidation support at IOMMU API
> > >> and
> > driver
> > >> level. This patch adds Intel VT-d specific function to implement
> > >> iommu passdown invalidate API for shared virtual address.
> > >>
> > >> The use case is for supporting caching structure invalidation
> > >> of assigned SVM capable devices. Emulated IOMMU exposes queue
> [...]
> [...]
> > to
> > >> + * VT-d granularity. Invalidation is typically included in the
> > >> unmap
> > operation
> > >> + * as a result of DMA or VFIO unmap. However, for assigned
> > >> devices
> > guest
> > >> + * owns the first level page tables. Invalidations of
> > >> translation caches in
> > the
> [...]
> [...]
> [...]
> > inv_type_granu_map[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_
> > >> NR] = {
> > >> + /*
> > >> + * PASID based IOTLB invalidation: PASID selective (per
> > >> PASID),
> > >> + * page selective (address granularity)
> > >> + */
> > >> + {0, 1, 1},
> > >> + /* PASID based dev TLBs, only support all PASIDs or
> > >> single PASID */
> > >> + {1, 1, 0},
> > >
> > > Is this combination correct? when single PASID is being
> > > specified, it is essentially a page-selective invalidation since
> > > you need provide Address and Size.
> > Isn't it the same when G=1? Still the addr/size is used. Doesn't
> > it
>
> I thought addr/size is not used when G=1, but it might be wrong. I'm
> checking with our vt-d spec owner.
>
> > correspond to IOMMU_INV_GRANU_ADDR with
> > IOMMU_INV_ADDR_FLAGS_PASID flag
> > unset?
> >
> > so {0, 0, 1}?
>
I am not sure I got your logic. The three fields correspond to
IOMMU_INV_GRANU_DOMAIN, /* domain-selective invalidation */
IOMMU_INV_GRANU_PASID, /* PASID-selective invalidation */
IOMMU_INV_GRANU_ADDR, /* page-selective invalidation */
For devTLB, we use domain as global since there is no domain. Then I
came up with {1, 1, 0}, which means we could have global and pasid
granu invalidation for PASID based devTLB.
If the caller also provides addr and the S bit, the flush routine will
put them into the QI descriptor. I know this is a little odd, but from
the granu translation p.o.v., the VT-d spec has no G bit for
page-selective invalidation.
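
To make the field ordering concrete, here is a minimal standalone sketch
of how such a map can be consulted. It is not the patch code: the enum
names and the granu_supported() helper are hypothetical stand-ins, and
only the two rows quoted above (IOTLB and devTLB) are included.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the IOMMU UAPI enums (illustration only). */
enum cache_type { CACHE_IOTLB, CACHE_DEV_IOTLB, CACHE_TYPE_NR };
enum granu { GRANU_DOMAIN, GRANU_PASID, GRANU_ADDR, GRANU_NR };

/*
 * Rows as quoted in this thread; columns are domain-, PASID- and
 * page-selective granularity, in that order.
 */
static const bool type_granu_map[CACHE_TYPE_NR][GRANU_NR] = {
	[CACHE_IOTLB]     = { false, true, true },
	[CACHE_DEV_IOTLB] = { true, true, false },
};

/* Return true if the (type, granularity) pair is marked as supported. */
bool granu_supported(enum cache_type type, enum granu granu)
{
	if ((unsigned int)type >= CACHE_TYPE_NR ||
	    (unsigned int)granu >= GRANU_NR)
		return false;
	return type_granu_map[type][granu];
}

/* Small usage example: checks a few entries of the map. */
void example_usage(void)
{
	/* PASID based IOTLB: no domain granularity, PASID and page are ok. */
	assert(!granu_supported(CACHE_IOTLB, GRANU_DOMAIN));
	assert(granu_supported(CACHE_IOTLB, GRANU_ADDR));
	/* devTLB: "domain" stands in for global; no page-selective granu. */
	assert(granu_supported(CACHE_DEV_IOTLB, GRANU_DOMAIN));
	assert(!granu_supported(CACHE_DEV_IOTLB, GRANU_ADDR));
}
```

With this layout, a devTLB request at address granularity is rejected by
the map even though addr and size can still be passed through to the
descriptor, which is the oddity described above.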
> I have one more open:
>
> How does userspace know which invalidation type/gran is supported?
> I didn't see such capability reporting in Yi's VFIO vSVA patch set.
> Do we want the user/kernel assume the same capability set if they are
> architectural? However the kernel could also do some optimization
> e.g. hide devtlb invalidation capability given that the kernel
> already invalidate devtlb automatically when serving iotlb
> invalidation...
>
In general, we are moving toward using the VFIO capability chain to
expose iommu capabilities.
But for architectural features such as type/granu, we have to assume
the same capability between host & guest. Granu and types are not
enumerated on the host IOMMU either.
For devTLB optimization, I agree we need to expose a capability to
the guest stating that implicit devtlb invalidation is supported.
Otherwise, a Linux guest may run on another host OS that does not
support implicit devtlb invalidation.
Right Yi?
> Thanks
> Kevin
>
> >
> > Thanks
> >
> > Eric
> >
> > >
> > >> + /* PASID cache */
> > >
> > > PASID cache is fully managed by the host. Guest PASID cache
> > > invalidation is interpreted by vIOMMU for bind and unbind
> > > operations. I don't think we should accept any PASID cache
> > > invalidation from userspace or guest.
> [...]
> > inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU
> [...]
> > >
> > > btw do we really need both map and table here? Can't we just
> > > use one table with unsupported granularity marked as a special
> > > value?
> > >
> [...]
> > >
> > > -ENOTSUPP?
> > >
> [...]
> > >
> > > granularity == IOMMU_INV_GRANU_ADDR? otherwise it's unclear
> > > why IOMMU_INV_GRANU_DOMAIN also needs size check.
> > >
> [...]
> > >>> addr_info.addr),
> [...]
> [...]
> > >> + if (info->ats_enabled) {
> > >> + qi_flush_dev_iotlb_pasid(iommu,
> > >> sid, info-
> > >>> pfsid,
> [...]
> > >>> pfsid,
> > >> +
> > >> inv_info->addr_info.pasid, info->ats_qdep,
> > >> +
> > >> inv_info->addr_info.addr, size,
> > >> + granu);
> [...]
> [...]
> > >>> pasid_info.pasid);
> > >> +
> > >
> > > as earlier comment, we shouldn't allow userspace or guest to
> > > invalidate PASID cache
> > >
> [...]
> > >
>
[Jacob Pan]