Re: [PATCH 0/5] iommu/vt-d: Consolidate various cache flush ops

From: Lu Baolu
Date: Tue Dec 03 2019 - 19:27:58 EST


Hi David,

On 12/3/19 4:49 PM, David Woodhouse wrote:
> On Fri, 2019-11-22 at 11:04 +0800, Lu Baolu wrote:
>> Intel VT-d 3.0 introduces more caches and interfaces for software to
>> flush when it runs in the scalable mode. Currently various cache flush
>> helpers are scattered around. This consolidates them by putting them in
>> the existing iommu_flush structure.
>>
>> /* struct iommu_flush - Intel IOMMU cache invalidation ops
>>  *
>>  * @cc_inv: Invalidate context cache.
>>  * @iotlb_inv: Invalidate IOTLB and paging structure caches when software
>>  *             has changed second-level tables.
>>  * @p_iotlb_inv: Invalidate IOTLB and paging structure caches when software
>>  *               has changed first-level tables.
>>  * @pc_inv: Invalidate PASID cache.
>>  * @dev_tlb_inv: Invalidate cached mappings used by requests-without-PASID
>>  *               from the Device-TLB on an endpoint device.
>>  * @p_dev_tlb_inv: Invalidate cached mappings used by requests-with-PASID
>>  *                 from the Device-TLB on an endpoint device.
>>  */
>> struct iommu_flush {
>> 	void (*cc_inv)(struct intel_iommu *iommu, u16 did,
>> 		       u16 sid, u8 fm, u64 type);
>> 	void (*iotlb_inv)(struct intel_iommu *iommu, u16 did, u64 addr,
>> 			  unsigned int size_order, u64 type);
>> 	void (*p_iotlb_inv)(struct intel_iommu *iommu, u16 did, u32 pasid,
>> 			    u64 addr, unsigned long npages, bool ih);
>> 	void (*pc_inv)(struct intel_iommu *iommu, u16 did, u32 pasid,
>> 		       u64 granu);
>> 	void (*dev_tlb_inv)(struct intel_iommu *iommu, u16 sid, u16 pfsid,
>> 			    u16 qdep, u64 addr, unsigned int mask);
>> 	void (*p_dev_tlb_inv)(struct intel_iommu *iommu, u16 sid, u16 pfsid,
>> 			      u32 pasid, u16 qdep, u64 addr,
>> 			      unsigned long npages);
>> };
>>
>> The name of each cache flush op is defined according to Section 6.5 of
>> the spec so that people can easily look them up in the spec.
>
> Hm, indirect function calls are quite expensive these days.

Good consideration. Thanks!
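
For reference, the idea was that the queued-invalidation back end would
plug the existing helpers into the consolidated ops roughly as below.
This is illustrative only: the init function name is made up, and the
scalable-mode (PASID) hooks added by this series are omitted.

/*
 * Illustrative only: wiring the existing queued-invalidation helpers
 * into the consolidated ops. Every flush then goes through a function
 * pointer, i.e. a retpoline thunk on mitigated kernels.
 */
static void intel_iommu_init_flush_ops(struct intel_iommu *iommu)
{
	iommu->flush.cc_inv = qi_flush_context;
	iommu->flush.iotlb_inv = qi_flush_iotlb;
	iommu->flush.dev_tlb_inv = qi_flush_dev_iotlb;
}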


> I would have preferred to go in the opposite direction, since we surely
> aren't going to have *many* of these implementations. Currently there's
> only one for register-based and one for queued invalidation, right?
> Even if VT-d 3.0 throws an extra version in, I think I'd prefer to take
> out the indirection completely and have an if/then helper.
>
> Would love to see a microbenchmark of unmap operations before and after
> this patch series with retpoline enabled, to see the effect.

Yes. We need some micro-benchmark tests to address this concern.
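
For illustration, a direct-dispatch helper along the lines suggested
above might look roughly like the untested sketch below; the helper name
and the qi check are illustrative, not taken from this series.

/*
 * Rough sketch of a direct-dispatch IOTLB flush: branch on whether
 * queued invalidation is in use instead of calling through a function
 * pointer (and thus a retpoline). The helper name is made up.
 */
static void intel_iotlb_inv(struct intel_iommu *iommu, u16 did, u64 addr,
			    unsigned int size_order, u64 type)
{
	if (iommu->qi)		/* queued invalidation enabled */
		qi_flush_iotlb(iommu, did, addr, size_order, type);
	else			/* register-based invalidation */
		__iommu_flush_iotlb(iommu, did, addr, size_order, type);
}

With only the register-based and queued-invalidation variants, the branch
should predict well, so it will be interesting to compare it against the
indirect call in the unmap micro-benchmark.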

Best regards,
baolu