Re: [PATCH v4 2/6] iova: Allow rcache range upper limit to be flexible

From: Robin Murphy
Date: Mon Aug 02 2021 - 12:09:48 EST


On 2021-08-02 16:23, John Garry wrote:
On 02/08/2021 16:01, Will Deacon wrote:
On Wed, Jul 14, 2021 at 06:36:39PM +0800, John Garry wrote:
Some LLDs may request DMA mappings whose IOVA length exceeds that of the
current rcache upper limit.

What's an LLD?


low-level driver

maybe I'll stick with simply "drivers"

This means that allocations for those IOVAs will never be cached, and
always must be allocated and freed from the RB tree per DMA mapping cycle.
This has a significant effect on performance, more so since commit
4e89dce72521 ("iommu/iova: Retry from last rb tree node if iova search
fails"), as discussed at [0].

As a first step towards allowing the rcache range upper limit be
configured, hold this value in the IOVA rcache structure, and allocate
the rcaches separately.

[0] https://lore.kernel.org/linux-iommu/20210129092120.1482-1-thunder.leizhen@xxxxxxxxxx/

Signed-off-by: John Garry <john.garry@xxxxxxxxxx>
---
  drivers/iommu/dma-iommu.c |  2 +-
  drivers/iommu/iova.c      | 23 +++++++++++++++++------
  include/linux/iova.h      |  4 ++--
  3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 98ba927aee1a..4772278aa5da 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -434,7 +434,7 @@ static dma_addr_t iommu_dma_alloc_iova(struct iommu_domain *domain,
       * rounding up anything cacheable to make sure that can't happen. The
       * order of the unadjusted size will still match upon freeing.
       */
-    if (iova_len < (1 << (IOVA_RANGE_CACHE_MAX_SIZE - 1)))
+    if (iova_len < (1 << (iovad->rcache_max_size - 1)))
          iova_len = roundup_pow_of_two(iova_len);
      dma_limit = min_not_zero(dma_limit, dev->bus_dma_limit);
diff --git a/drivers/iommu/iova.c b/drivers/iommu/iova.c
index b6cf5f16123b..07ce73fdd8c1 100644
--- a/drivers/iommu/iova.c
+++ b/drivers/iommu/iova.c
@@ -15,6 +15,8 @@
  /* The anchor node sits above the top of the usable address space */
  #define IOVA_ANCHOR    ~0UL
+#define IOVA_RANGE_CACHE_MAX_SIZE 6    /* log of max cached IOVA range size (in pages) */

Is that the same as an 'order'? i.e. IOVA_RANGE_CACHE_MAX_ORDER?

Yeah, that may be better. I was just using the same name as before.


+
  static bool iova_rcache_insert(struct iova_domain *iovad,
                     unsigned long pfn,
                     unsigned long size);
@@ -881,7 +883,14 @@ static void init_iova_rcaches(struct iova_domain *iovad)
      unsigned int cpu;
      int i;
-    for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
+    iovad->rcache_max_size = IOVA_RANGE_CACHE_MAX_SIZE;
+
+    iovad->rcaches = kcalloc(iovad->rcache_max_size,
+                 sizeof(*iovad->rcaches), GFP_KERNEL);
+    if (!iovad->rcaches)
+        return;

Returning quietly here doesn't seem like the right thing to do. At least, I
don't think the rest of the functions here are checking rcaches against
NULL.


For sure, but that is what other code which can fail here already does, like:

static void init_iova_rcaches(struct iova_domain *iovad)
{
    ...

    for (i = 0; i < IOVA_RANGE_CACHE_MAX_SIZE; ++i) {
        ...

        rcache->cpu_rcaches = __alloc_percpu(sizeof(*cpu_rcache), cache_line_size());
        if (WARN_ON(!rcache->cpu_rcaches))
            continue;
}

and that is not safe either.

Yeah, along with flush queues, historically this has all been super-dodgy in terms of failure handling (or lack of).

This issue was raised a while ago. I don't mind trying to fix it - a slightly painful part is that it touches a few subsystems.

Maybe pull the rcache init out of iova_domain_init() entirely? Only iommu-dma uses {alloc,free}_iova_fast(), so TBH it's only a great big waste of memory for all the other IOVA domain users anyway.

The other week I started pondering how much of iommu-dma only needs to be exposed to the IOMMU core rather than the whole kernel now; I suppose there's probably an equal argument to be made for some of these bits of the IOVA API, and this might pave the way towards some more logical separation, but let's get the functional side dealt with before we worry too much about splitting headers.

Robin.