[PATCH] iommu: Optimise PCI SAC address trick

From: Robin Murphy
Date: Fri Sep 16 2022 - 14:53:54 EST


Per the reasoning in commit 4bf7fda4dce2 ("iommu/dma: Add config for
PCI SAC address trick") and its subsequent revert, this mechanism no
longer serves its original purpose, but now only works around broken
hardware/drivers in a way that is unfortunately too impactful to remove.

This does not, however, prevent us from addressing the performance
impact which that workaround has on large-scale systems that don't need
it. That impact kicks in once the 32-bit IOVA space fills up and we keep
unsuccessfully trying to allocate from it. However, if we get to that
point then in fact it's already the endgame. The nature of the allocator
is such that the first IOVA we give to a device after the 32-bit space
runs out will be the highest possible address for that device, ever.
If that works, then great, we know we can optimise for speed by always
allocating from the full range. And if it doesn't, then the worst has
already happened and any brokenness is now showing, so there's no point
continuing to try to hide it.

To that end, implement a flag to refine this into a per-device policy
that can automatically get itself out of the way if and when it stops
being useful.

CC: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Robin Murphy <robin.murphy@xxxxxxx>
---
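For review convenience, an illustrative (untested) collation of how the
two halves below fit together once applied - the probe-time arming from
the iommu.c hunk next to the self-disarming allocation path from the
dma-iommu.c hunk. This is purely a restatement of the diff, not an
additional change:

	/* __iommu_probe_device(): arm the trick per PCI device unless forcedac */
	if (!iommu_dma_forcedac && dev_is_pci(dev))
		dev->iommu->pci_workaround = true;

	/* iommu_dma_alloc_iova(): try for a SAC address while the flag holds... */
	if (dma_limit > DMA_BIT_MASK(32) && dev->iommu->pci_workaround) {
		iova = alloc_iova_fast(iovad, iova_len,
				       DMA_BIT_MASK(32) >> shift, false);
		/* ...and stop trying, permanently, once 32-bit space runs out */
		if (!iova)
			dev->iommu->pci_workaround = false;
	}
	if (!iova)
		iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift,
				       true);
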
 drivers/iommu/dma-iommu.c | 5 ++++-
 drivers/iommu/iommu.c     | 3 +++
 include/linux/iommu.h     | 2 ++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 9297b741f5e8..1cebb16faa33 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -643,9 +643,12 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
 		dma_limit = min(dma_limit, (u64)domain->geometry.aperture_end);
 
 	/* Try to get PCI devices a SAC address */
-	if (dma_limit > DMA_BIT_MASK(32) && !iommu_dma_forcedac && dev_is_pci(dev))
+	if (dma_limit > DMA_BIT_MASK(32) && dev->iommu->pci_workaround) {
 		iova = alloc_iova_fast(iovad, iova_len,
 				       DMA_BIT_MASK(32) >> shift, false);
+		if (!iova)
+			dev->iommu->pci_workaround = false;
+	}
 
 	if (!iova)
 		iova = alloc_iova_fast(iovad, iova_len, dma_limit >> shift,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index edc768bf8976..19d0a6daae73 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -323,6 +323,9 @@ static int __iommu_probe_device(struct device *dev, struct list_head *group_list

 	iommu_device_link(iommu_dev, dev);
 
+	if (!iommu_dma_forcedac && dev_is_pci(dev))
+		dev->iommu->pci_workaround = true;
+
 	return 0;
 
 out_release:
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 79cb6eb560a8..0eb0f808109c 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -368,6 +368,7 @@ struct iommu_fault_param {
  * @fwspec:	 IOMMU fwspec data
  * @iommu_dev:	 IOMMU device this device is linked to
  * @priv:	 IOMMU Driver private data
+ * @pci_workaround: Limit DMA allocations to 32-bit IOVAs
  *
  * TODO: migrate other per device data pointers under iommu_dev_data, e.g.
  *	struct iommu_group	*iommu_group;
@@ -379,6 +380,7 @@ struct dev_iommu {
 	struct iommu_fwspec		*fwspec;
 	struct iommu_device		*iommu_dev;
 	void				*priv;
+	bool				pci_workaround;
 };
 
 int iommu_device_register(struct iommu_device *iommu,
--
2.36.1.dirty