[PATCH v3 0/4] Optimise 64-bit IOVA allocations

From: Robin Murphy
Date: Tue Aug 22 2017 - 11:17:57 EST


Hi all,

Just a quick repost of v2[1] with a small fix for the bug reported by Nate.
To recap, whilst this mostly only improves worst-case performance, those
worst-cases have a tendency to be pathologically bad:

Ard reports general desktop performance with Chromium on AMD Seattle going
from ~1-2 FPS to perfectly usable.

Leizhen reports Gigabit Ethernet throughput going from ~6.5 Mbit/s to line
speed.

I also inadvertently found that the HiSilicon hns_dsaf driver was taking ~35s
to probe simply because of the number of DMA buffers it maps on startup (perf
shows around 76% of that was spent under the lock in alloc_iova()). With this
series applied it takes a mere ~1s, mostly spent in unrelated mdelay()s, with
alloc_iova() entirely lost in the noise.
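
For anyone who wants the gist without digging into the patches themselves:
the caching simply remembers where the previous allocation landed, so that
alloc_iova() need not re-walk every existing entry from the top of the
address space on each call. The sketch below illustrates that principle in a
self-contained way, using a sorted array in place of the real rbtree; the
names (toy_domain, toy_alloc, toy_free) are invented for illustration and do
not correspond to the kernel API.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_RANGES 64

struct toy_range { unsigned long lo, hi; };     /* inclusive PFN range */

struct toy_domain {
        struct toy_range r[MAX_RANGES]; /* sorted by 'lo', highest first */
        int nr;
        unsigned long top;              /* highest usable PFN */
        int cached;                     /* index at which to resume scanning */
};

/* Top-down allocation: scan holes starting at the cached index rather
 * than rescanning every allocated range from the very top. */
static bool toy_alloc(struct toy_domain *d, unsigned long size,
                      unsigned long *pfn)
{
        unsigned long high = d->cached ? d->r[d->cached - 1].lo - 1 : d->top;
        int i;

        if (d->nr >= MAX_RANGES)
                return false;

        for (i = d->cached; i <= d->nr; i++) {
                unsigned long low = (i < d->nr) ? d->r[i].hi + 1 : 0;

                if (high >= low && high - low + 1 >= size) {
                        /* Carve the new range from the top of this hole. */
                        memmove(&d->r[i + 1], &d->r[i],
                                (d->nr - i) * sizeof(d->r[0]));
                        d->r[i].lo = high - size + 1;
                        d->r[i].hi = high;
                        d->nr++;
                        d->cached = i + 1;      /* next search starts below us */
                        *pfn = d->r[i].lo;
                        return true;
                }
                if (i < d->nr)
                        high = d->r[i].lo - 1;
        }
        return false;
}

/* Freeing a range above the cached position pulls the hint back so the
 * reopened hole is seen again. */
static void toy_free(struct toy_domain *d, unsigned long pfn)
{
        int i;

        for (i = 0; i < d->nr; i++) {
                if (d->r[i].lo == pfn) {
                        memmove(&d->r[i], &d->r[i + 1],
                                (d->nr - i - 1) * sizeof(d->r[0]));
                        d->nr--;
                        if (i < d->cached)
                                d->cached = i;
                        return;
                }
        }
}

int main(void)
{
        struct toy_domain d = { .top = 1UL << 20 };
        unsigned long a, b;

        toy_alloc(&d, 16, &a);
        toy_alloc(&d, 16, &b); /* resumes directly below the first range */
        printf("a=0x%lx b=0x%lx\n", a, b);
        toy_free(&d, a);
        return 0;
}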

Robin.

[1] https://www.mail-archive.com/iommu@xxxxxxxxxxxxxxxxxxxxxxxxxx/msg19139.html

Robin Murphy (1):
  iommu/iova: Extend rbtree node caching

Zhen Lei (3):
  iommu/iova: Optimise rbtree searching
  iommu/iova: Optimise the padding calculation
  iommu/iova: Make dma_32bit_pfn implicit

 drivers/gpu/drm/tegra/drm.c      |   3 +-
 drivers/gpu/host1x/dev.c         |   3 +-
 drivers/iommu/amd_iommu.c        |   7 +--
 drivers/iommu/dma-iommu.c        |  18 +------
 drivers/iommu/intel-iommu.c      |  11 ++--
 drivers/iommu/iova.c             | 114 +++++++++++++++++----------------------
 drivers/misc/mic/scif/scif_rma.c |   3 +-
 include/linux/iova.h             |   8 +--
 8 files changed, 62 insertions(+), 105 deletions(-)

--
2.13.4.dirty