RE: [PATCH v3 5/6] iommu/vt-d: Flush PASID-based iotlb for iova over first level
From: Liu, Yi L
Date: Sun Dec 15 2019 - 04:23:11 EST
Hi Baolu,
Please see my replies below:
> From: Lu Baolu [mailto:baolu.lu@xxxxxxxxxxxxxxx]
> Sent: Saturday, December 14, 2019 11:24 AM
> To: Liu, Yi L <yi.l.liu@xxxxxxxxx>; Joerg Roedel <joro@xxxxxxxxxx>; David
> Woodhouse <dwmw2@xxxxxxxxxxxxx>; Alex Williamson
> <alex.williamson@xxxxxxxxxx>
> Subject: Re: [PATCH v3 5/6] iommu/vt-d: Flush PASID-based iotlb for iova over first
> level
>
> Hi Liu Yi,
>
> On 12/13/19 7:42 PM, Liu, Yi L wrote:
> >> From: kvm-owner@xxxxxxxxxxxxxxx [mailto:kvm-owner@xxxxxxxxxxxxxxx] On
> Behalf
> >> Of Lu Baolu
> >> Sent: Wednesday, December 11, 2019 10:12 AM
> >> To: Joerg Roedel <joro@xxxxxxxxxx>; David Woodhouse
> <dwmw2@xxxxxxxxxxxxx>;
> >> Subject: [PATCH v3 5/6] iommu/vt-d: Flush PASID-based iotlb for iova over first
> level
> >>
> >> When software has changed first-level tables, it should invalidate
> >> the affected IOTLB and the paging-structure-caches using the PASID-
> >> based-IOTLB Invalidate Descriptor defined in spec 6.5.2.4.
> >>
> >> Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> >> ---
> >> drivers/iommu/dmar.c | 41 ++++++++++++++++++++++++++++++++++
> >> drivers/iommu/intel-iommu.c | 44 ++++++++++++++++++++++++-------------
> >> include/linux/intel-iommu.h | 2 ++
> >> 3 files changed, 72 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> >> index 3acfa6a25fa2..fb30d5053664 100644
> >> --- a/drivers/iommu/dmar.c
> >> +++ b/drivers/iommu/dmar.c
> >> @@ -1371,6 +1371,47 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu,
> u16
> >> sid, u16 pfsid,
> >> qi_submit_sync(&desc, iommu);
> >> }
> >>
> >> +/* PASID-based IOTLB invalidation */
> >> +void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
> >> + unsigned long npages, bool ih)
> >> +{
> >> + struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
> >> +
> >> + /*
> >> + * npages == -1 means a PASID-selective invalidation, otherwise,
> >> + * a positive value for Page-selective-within-PASID invalidation.
> >> + * 0 is not a valid input.
> >> + */
> >> + if (WARN_ON(!npages)) {
> >> + pr_err("Invalid input npages = %ld\n", npages);
> >> + return;
> >> + }
> >> +
> >> + if (npages == -1) {
> >> + desc.qw0 = QI_EIOTLB_PASID(pasid) |
> >> + QI_EIOTLB_DID(did) |
> >> + QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) |
> >> + QI_EIOTLB_TYPE;
> >> + desc.qw1 = 0;
> >> + } else {
> >> + int mask = ilog2(__roundup_pow_of_two(npages));
> >> + unsigned long align = (1ULL << (VTD_PAGE_SHIFT + mask));
> >> +
> >> + if (WARN_ON_ONCE(!ALIGN(addr, align)))
> >> + addr &= ~(align - 1);
> >> +
> >> + desc.qw0 = QI_EIOTLB_PASID(pasid) |
> >> + QI_EIOTLB_DID(did) |
> >> + QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) |
> >> + QI_EIOTLB_TYPE;
> >> + desc.qw1 = QI_EIOTLB_ADDR(addr) |
> >> + QI_EIOTLB_IH(ih) |
> >> + QI_EIOTLB_AM(mask);
> >> + }
> >> +
> >> + qi_submit_sync(&desc, iommu);
> >> +}
> >> +
> >> /*
> >> * Disable Queued Invalidation interface.
> >> */
> >> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> >> index 83a7abf0c4f0..e47f5fe37b59 100644
> >> --- a/drivers/iommu/intel-iommu.c
> >> +++ b/drivers/iommu/intel-iommu.c
> >> @@ -1520,18 +1520,24 @@ static void iommu_flush_iotlb_psi(struct
> intel_iommu
> >> *iommu,
> >>
> >> if (ih)
> >> ih = 1 << 6;
> >> - /*
> >> - * Fallback to domain selective flush if no PSI support or the size is
> >> - * too big.
> >> - * PSI requires page size to be 2 ^ x, and the base address is naturally
> >> - * aligned to the size
> >> - */
> >> - if (!cap_pgsel_inv(iommu->cap) || mask > cap_max_amask_val(iommu-
> >>> cap))
> >> - iommu->flush.flush_iotlb(iommu, did, 0, 0,
> >> - DMA_TLB_DSI_FLUSH);
> >> - else
> >> - iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
> >> - DMA_TLB_PSI_FLUSH);
> >> +
> >> + if (domain_use_first_level(domain)) {
> >> + qi_flush_piotlb(iommu, did, domain->default_pasid,
> >> + addr, pages, ih);
> >
> > I'm not sure if my understanding is correct. But let me tell a story.
> > Assuming we assign a mdev and a PF/VF to a single VM, then there
> > will be p_iotlb tagged with PASID_RID2PASID and p_iotlb tagged with
> > default_pasid. We may want to flush both... If this operation is
>
> I assume that SRIOV and SIOV are exclusive. You can't enable both SRIOV
> and SIOV on a single device.
Yes, but I'm not talking about using them on a single device...
> So the mdev and PF/VF are from different
> devices, right?
Yes. The case I mentioned above involves an mdev created on one device
(say devA) plus another device (say devB): create the mdev on devA and
assign it to a VM together with devB.
>
> Or, in SRIOV case, you can wrap a PF or VF as a mediated device. But
> this mdev still be backed with a pasid of RID2PASID.
My comment has nothing to do with wrapping a PF/VF as an mdev...
>
> > invoked per-device, then need to pass in a hint to indicate whether
> > to use PASID_RID2PASID or default_pasid, or you may just issue two
> > flush with the two PASID values. Thoughts?
>
> This is per-domain and each domain has specific domain id and default
> pasid (assume default domain is 0 in RID2PASID case).
OK, let me explain more. The default pasid is meaningful only when
the domain has been attached to a device as an aux-domain, right?
If a domain has only one device, and it is attached to that device as a
normal domain (normal domain means non-aux-domain here), then
you should flush the cache with the domain-id and the RID2PASID value.
If a domain has one device, and it is attached to that device as an
aux-domain, then you want to flush the cache with the domain-id
and the default pasid, right?
Now let's come back to the case I mentioned in my previous email: an mdev
and another device assigned to a single VM. In the host, you will have
a domain with two devices: one device (devB) is attached as a normal
domain, and the other (devA, the parent of the mdev) is attached as an
aux-domain. Then which pasid should be used when a mapping in the IOVA
page table is modified? RID2PASID or the default pasid? I think both
should be used, since the domain means different things to the two
devices. If you only use the default pasid, then devB may still be able
to use stale cache entries.
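To make the point concrete, here is a rough user-space sketch of the
selection logic I have in mind. It is not kernel code and has not been
tested against the driver: the struct and function names are made up for
illustration, and the RID2PASID value of 0 is an assumption. It only
models "which PASIDs need a PASID-based IOTLB flush for this domain",
given how each device is attached.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: RID2PASID is a fixed PASID value; 0 is used for illustration. */
#define SKETCH_PASID_RID2PASID 0u

/* Hypothetical model of one device attached to a domain. */
struct sketch_attachment {
	const char *name;
	bool is_aux;	/* true: attached as aux-domain; false: normal domain */
};

/* Hypothetical model of a domain; not the kernel's struct dmar_domain. */
struct sketch_domain {
	uint32_t default_pasid;
	const struct sketch_attachment *devs;
	size_t ndevs;
};

/*
 * Collect the distinct PASIDs that should be flushed when the domain's
 * first-level page table changes:
 *   - any normal attachment  -> RID2PASID
 *   - any aux attachment     -> the domain's default pasid
 * Writes at most two entries to out[] and returns how many were written.
 */
static size_t sketch_pasids_to_flush(const struct sketch_domain *d,
				     uint32_t out[2])
{
	bool need_rid2pasid = false;
	bool need_default = false;
	size_t n = 0;

	for (size_t i = 0; i < d->ndevs; i++) {
		if (d->devs[i].is_aux)
			need_default = true;
		else
			need_rid2pasid = true;
	}

	if (need_rid2pasid)
		out[n++] = SKETCH_PASID_RID2PASID;

	/* Skip the second entry if both would name the same PASID. */
	if (need_default &&
	    !(need_rid2pasid && d->default_pasid == SKETCH_PASID_RID2PASID))
		out[n++] = d->default_pasid;

	return n;
}
```

With one aux attachment and one normal attachment in the same domain,
this returns both PASIDs, so the caller would issue two flushes. With
only one kind of attachment it returns a single PASID. Whether the real
code should track this per domain or simply always issue both flushes is
open for discussion.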
Regards,
Yi Liu