-----Original Message-----
From: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
Sent: Wednesday, April 10, 2024 10:09 AM
To: Joerg Roedel <joro@xxxxxxxxxx>; Will Deacon <will@xxxxxxxxxx>; Robin Murphy <robin.murphy@xxxxxxx>; Tian, Kevin <kevin.tian@xxxxxxxxx>; Jason Gunthorpe <jgg@xxxxxxxx>
Cc: Zhang, Tina <tina.zhang@xxxxxxxxx>; Liu, Yi L <yi.l.liu@xxxxxxxxx>; iommu@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
Add several helpers to invalidate the caches after mappings in the affected
domain are changed.
- cache_tag_flush_range() invalidates a range of caches after mappings
within this range are changed. It uses the page-selective cache
invalidation methods.
- cache_tag_flush_all() invalidates all caches tagged by a domain ID.
It uses the domain-selective cache invalidation methods.
- cache_tag_flush_range_np() invalidates a range of caches when new
mappings are created in the domain and the corresponding page table
entries change from non-present to present.
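
My understanding of how the unmap path would end up consuming cache_tag_flush_range() is roughly the following. This is only a sketch on my side (the callback and gather field names are from the existing driver; the exact wiring is my assumption), not code quoted from this series:

	static void intel_iommu_tlb_sync(struct iommu_domain *domain,
					 struct iommu_iotlb_gather *gather)
	{
		/* Flush everything the gather accumulated for this domain. */
		cache_tag_flush_range(to_dmar_domain(domain), gather->start,
				      gather->end,
				      list_empty(&gather->freelist));
		/* ... then release the page-table pages queued on gather->freelist ... */
	}

If that is the intent, the helpers nicely hide the per-tag details (IOTLB vs. device TLB, with or without PASID) from the callers.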
Signed-off-by: Lu Baolu <baolu.lu@xxxxxxxxxxxxxxx>
---
drivers/iommu/intel/iommu.h | 14 +++
drivers/iommu/intel/cache.c | 195 ++++++++++++++++++++++++++++++++++++
drivers/iommu/intel/iommu.c | 12 ---
3 files changed, 209 insertions(+), 12 deletions(-)
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index 6f6bffc60852..700574421b51 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -35,6 +35,8 @@
 #define VTD_PAGE_MASK		(((u64)-1) << VTD_PAGE_SHIFT)
 #define VTD_PAGE_ALIGN(addr)	(((addr) + VTD_PAGE_SIZE - 1) & VTD_PAGE_MASK)
 
+#define IOVA_PFN(addr)		((addr) >> PAGE_SHIFT)
+
 #define VTD_STRIDE_SHIFT        (9)
 #define VTD_STRIDE_MASK         (((u64)-1) << VTD_STRIDE_SHIFT)
@@ -1041,6 +1043,13 @@ static inline void context_set_sm_pre(struct context_entry *context)
 	context->lo |= BIT_ULL(4);
 }
 
+/* Returns a number of VTD pages, but aligned to MM page size */
+static inline unsigned long aligned_nrpages(unsigned long host_addr, size_t size)
+{
+	host_addr &= ~PAGE_MASK;
+	return PAGE_ALIGN(host_addr + size) >> VTD_PAGE_SHIFT;
+}
+
 /* Convert value to context PASID directory size field coding. */
 #define context_pdts(pds)	(((pds) & 0x7) << 9)
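
A quick sanity check of aligned_nrpages() with made-up numbers (my own arithmetic, not from the patch): for host_addr = 0x1234 and size = 0x2000,

	host_addr &= ~PAGE_MASK;		/* 0x1234 -> 0x234 */
	PAGE_ALIGN(0x234 + 0x2000)		/* -> 0x3000       */
	0x3000 >> VTD_PAGE_SHIFT		/* -> 3            */

i.e. it returns 3 pages, because a two-page-long buffer starting in the middle of a page actually touches three 4KiB pages.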
@@ -1116,6 +1125,11 @@ int cache_tag_assign_domain(struct dmar_domain *domain, u16 did,
 			    struct device *dev, ioasid_t pasid);
 void cache_tag_unassign_domain(struct dmar_domain *domain, u16 did,
 			       struct device *dev, ioasid_t pasid);
+void cache_tag_flush_range(struct dmar_domain *domain, unsigned long start,
+			   unsigned long end, int ih);
+void cache_tag_flush_all(struct dmar_domain *domain);
+void cache_tag_flush_range_np(struct dmar_domain *domain, unsigned long start,
+			      unsigned long end);
 
#ifdef CONFIG_INTEL_IOMMU_SVM
 void intel_svm_check(struct intel_iommu *iommu);

diff --git a/drivers/iommu/intel/cache.c b/drivers/iommu/intel/cache.c
index debbdaeff1c4..b2270dc8a765 100644
--- a/drivers/iommu/intel/cache.c
+++ b/drivers/iommu/intel/cache.c
@@ -12,6 +12,7 @@
#include <linux/dmar.h>
#include <linux/iommu.h>
#include <linux/memory.h>
+#include <linux/pci.h>
#include <linux/spinlock.h>
#include "iommu.h"
@@ -194,3 +195,197 @@ void cache_tag_unassign_domain(struct dmar_domain *domain, u16 did,
 	if (domain->domain.type == IOMMU_DOMAIN_NESTED)
 		__cache_tag_unassign_parent_domain(domain->s2_domain,
 						   did, dev, pasid);
 }
+
+static unsigned long calculate_psi_aligned_address(unsigned long start,
+ unsigned long end,
+ unsigned long *_pages,
+ unsigned long *_mask)
+{
+ unsigned long pages = aligned_nrpages(start, end - start + 1);
+ unsigned long aligned_pages = __roundup_pow_of_two(pages);
+ unsigned long bitmask = aligned_pages - 1;
+ unsigned long mask = ilog2(aligned_pages);
+ unsigned long pfn = IOVA_PFN(start);
+
+ /*
+ * PSI masks the low order bits of the base address. If the
+ * address isn't aligned to the mask, then compute a mask value
+ * needed to ensure the target range is flushed.
+ */
+ if (unlikely(bitmask & pfn)) {
+ unsigned long end_pfn = pfn + pages - 1, shared_bits;
+
+ /*
+ * Since end_pfn <= pfn + bitmask, the only way bits
+ * higher than bitmask can differ in pfn and end_pfn is
+ * by carrying. This means after masking out bitmask,
+ * high bits starting with the first set bit in
+ * shared_bits are all equal in both pfn and end_pfn.
+ */
+ shared_bits = ~(pfn ^ end_pfn) & ~bitmask;
+ mask = shared_bits ? __ffs(shared_bits) : BITS_PER_LONG;
+ }
+
+ *_pages = aligned_pages;
+ *_mask = mask;
+
+	return ALIGN_DOWN(start, VTD_PAGE_SIZE);
+}

It's a good idea to use the above logic to calculate the appropriate mask for non-size-aligned page selective invalidation for all domains, including the sva domain. Sounds like we plan to allow non-size-aligned page selective invalidation.
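
For instance, here is my own walk-through with made-up numbers (not from the patch), for a flush of two pages starting at pfn 0x1007, i.e. pfns 0x1007 and 0x1008:

	pages = 2, aligned_pages = 2, bitmask = 0x1, mask = 1
	bitmask & pfn != 0, so:
		end_pfn     = 0x1008
		shared_bits = ~(0x1007 ^ 0x1008) & ~0x1 = ...fffff0
		mask        = __ffs(shared_bits) = 4

so the invalidation is issued with mask 4, covering pfns 0x1000 through 0x100f, which safely includes the two target pages.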
However, there are currently places in the code that assume the start address of a page selective invalidation is aligned to the invalidation size, i.e. only size-aligned page selective invalidation is expected and a non-size-aligned one is not.
One example is this check in qi_flush_dev_iotlb_pasid():

	if (!IS_ALIGNED(addr, VTD_PAGE_SIZE << size_order))
		pr_warn_ratelimited(...)

If non-size-aligned page selective invalidation is allowed, this warning may be triggered, which is both confusing to users and a performance hit.
Another example is in qi_flush_piotlb(), where a WARN_ON_ONCE() fires when a non-size-aligned page selective invalidation is issued:

	if (WARN_ON_ONCE(!IS_ALIGNED(addr, align)))
		...
So, can we consider removing those checks as part of this patch set as well?
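
Just to make the suggestion concrete, one possible direction (a rough sketch on my side, quoting the existing code from memory, not a tested change) would be to drop the warning in qi_flush_piotlb() and simply align the address down, since the mask computed by calculate_psi_aligned_address() is already large enough to cover the whole range:

	-	if (WARN_ON_ONCE(!IS_ALIGNED(addr, align)))
	-		addr = ALIGN_DOWN(addr, align);
	+	addr = ALIGN_DOWN(addr, align);

and similarly relax the pr_warn_ratelimited() in qi_flush_dev_iotlb_pasid().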