RE: [PATCH v3 5/6] iommu/vt-d: Flush PASID-based iotlb for iova over first level

From: Liu, Yi L
Date: Fri Dec 13 2019 - 06:42:45 EST


Hi Allen,

> From: kvm-owner@xxxxxxxxxxxxxxx [mailto:kvm-owner@xxxxxxxxxxxxxxx] On Behalf
> Of Lu Baolu
> Sent: Wednesday, December 11, 2019 10:12 AM
> To: Joerg Roedel <joro@xxxxxxxxxx>; David Woodhouse <dwmw2@xxxxxxxxxxxxx>;
> Subject: [PATCH v3 5/6] iommu/vt-d: Flush PASID-based iotlb for iova over first level
>
> When software has changed first-level tables, it should invalidate
> the affected IOTLB and the paging-structure-caches using the PASID-
> based-IOTLB Invalidate Descriptor defined in spec 6.5.2.4.
>
> Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
> ---
> drivers/iommu/dmar.c | 41 ++++++++++++++++++++++++++++++++++
> drivers/iommu/intel-iommu.c | 44 ++++++++++++++++++++++++-------------
> include/linux/intel-iommu.h | 2 ++
> 3 files changed, 72 insertions(+), 15 deletions(-)
>
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index 3acfa6a25fa2..fb30d5053664 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -1371,6 +1371,47 @@ void qi_flush_dev_iotlb(struct intel_iommu *iommu, u16 sid, u16 pfsid,
> qi_submit_sync(&desc, iommu);
> }
>
> +/* PASID-based IOTLB invalidation */
> +void qi_flush_piotlb(struct intel_iommu *iommu, u16 did, u32 pasid, u64 addr,
> + unsigned long npages, bool ih)
> +{
> + struct qi_desc desc = {.qw2 = 0, .qw3 = 0};
> +
> + /*
> + * npages == -1 means a PASID-selective invalidation, otherwise,
> + * a positive value for Page-selective-within-PASID invalidation.
> + * 0 is not a valid input.
> + */
> + if (WARN_ON(!npages)) {
> + pr_err("Invalid input npages = %ld\n", npages);
> + return;
> + }
> +
> + if (npages == -1) {
> + desc.qw0 = QI_EIOTLB_PASID(pasid) |
> + QI_EIOTLB_DID(did) |
> + QI_EIOTLB_GRAN(QI_GRAN_NONG_PASID) |
> + QI_EIOTLB_TYPE;
> + desc.qw1 = 0;
> + } else {
> + int mask = ilog2(__roundup_pow_of_two(npages));
> + unsigned long align = (1ULL << (VTD_PAGE_SHIFT + mask));
> +
> + if (WARN_ON_ONCE(!IS_ALIGNED(addr, align)))
> + addr &= ~(align - 1);
> +
> + desc.qw0 = QI_EIOTLB_PASID(pasid) |
> + QI_EIOTLB_DID(did) |
> + QI_EIOTLB_GRAN(QI_GRAN_PSI_PASID) |
> + QI_EIOTLB_TYPE;
> + desc.qw1 = QI_EIOTLB_ADDR(addr) |
> + QI_EIOTLB_IH(ih) |
> + QI_EIOTLB_AM(mask);
> + }
> +
> + qi_submit_sync(&desc, iommu);
> +}
> +
> /*
> * Disable Queued Invalidation interface.
> */
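
Not an objection to this hunk, just to double check my reading of the new
helper. A caller would presumably use it as in the sketch below; iommu, did,
pasid and addr stand for whatever the attached domain provides, so this is
only an illustration, not code from this patch:

	/* PASID-selective: flush everything cached for this PASID */
	qi_flush_piotlb(iommu, did, pasid, 0, -1, false);

	/*
	 * Page-selective-within-PASID: 16 pages starting at addr (addr
	 * should be naturally aligned to the flushed range); only leaf
	 * entries changed, so the paging-structure caches may be kept
	 * (ih = true).
	 */
	qi_flush_piotlb(iommu, did, pasid, addr, 16, true);
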
> diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
> index 83a7abf0c4f0..e47f5fe37b59 100644
> --- a/drivers/iommu/intel-iommu.c
> +++ b/drivers/iommu/intel-iommu.c
> @@ -1520,18 +1520,24 @@ static void iommu_flush_iotlb_psi(struct intel_iommu *iommu,
>
> if (ih)
> ih = 1 << 6;
> - /*
> - * Fallback to domain selective flush if no PSI support or the size is
> - * too big.
> - * PSI requires page size to be 2 ^ x, and the base address is naturally
> - * aligned to the size
> - */
> - if (!cap_pgsel_inv(iommu->cap) || mask > cap_max_amask_val(iommu->cap))
> - iommu->flush.flush_iotlb(iommu, did, 0, 0,
> - DMA_TLB_DSI_FLUSH);
> - else
> - iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
> - DMA_TLB_PSI_FLUSH);
> +
> + if (domain_use_first_level(domain)) {
> + qi_flush_piotlb(iommu, did, domain->default_pasid,
> + addr, pages, ih);

I'm not sure my understanding is correct, but consider the following
scenario. Suppose we assign both an mdev and a PF/VF to a single VM. There
will then be PASID-based IOTLB (p_iotlb) entries tagged with PASID_RID2PASID
as well as entries tagged with default_pasid, and we may want to flush both.
If this operation is invoked per-device, a hint would need to be passed in
to indicate whether to use PASID_RID2PASID or default_pasid; alternatively,
two flushes could simply be issued, one for each PASID value. Thoughts?
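
For example, something like the sketch below on top of this hunk. It is only
illustrative: the domain->default_pasid check stands in for whatever
condition identifies an aux-domain (mdev) attachment, which this patch does
not spell out:

	if (domain_use_first_level(domain)) {
		/*
		 * Flush mappings cached under the PASID used for requests
		 * without PASID (RID2PASID) ...
		 */
		qi_flush_piotlb(iommu, did, PASID_RID2PASID, addr, pages, ih);

		/*
		 * ... and, if the domain is also attached as an aux-domain,
		 * under its default PASID as well.
		 */
		if (domain->default_pasid)
			qi_flush_piotlb(iommu, did, domain->default_pasid,
					addr, pages, ih);
	}

That would keep a per-device hint out of the flush path, at the cost of an
extra invalidation descriptor when both PASIDs are in use.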

Regards,
Yi Liu