[PATCH v2 1/3] powerpc/mm/hash: Avoid resizing-down HPT on first memory hotplug
From: Leonardo Bras
Date: Fri Apr 30 2021 - 10:36:48 EST
Because hypervisors may need to create the HPT without knowing the guest
page size, they may size it for the smallest supported page size (4k),
resulting in an HPT that is possibly bigger than needed.
On a guest with bigger page sizes, the number of HPT entries may be
too high, causing the guest to ask for an HPT resize-down on the first
hotplug.
This becomes a problem when the HPT resize-down fails: the resize is
then attempted again on every LMB added, until the HPT size is
compatible with the guest memory size, causing a major slowdown.
So, avoiding HPT resizing-down on hot-add significantly improves memory
hotplug times.
As an example, hotplugging 256GB on a 129GB guest took 710s without this
patch, and 21s with it applied.
Signed-off-by: Leonardo Bras <leobras.c@xxxxxxxxx>
---
arch/powerpc/mm/book3s64/hash_utils.c | 36 ++++++++++++++++-----------
1 file changed, 21 insertions(+), 15 deletions(-)
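Note for reviewers: the decision this patch makes in
resize_hpt_for_hotplug() can be restated as a small standalone predicate.
The sketch below is illustrative only and is not part of the patch.
hpt_resize_wanted() is a hypothetical name, and current_shift stands in for
the kernel's ppc64_pft_size. It returns true in exactly the cases where the
patched function would go on to call mmu_hash_ops.resize_hpt().

```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for the check in resize_hpt_for_hotplug().
 * current_shift plays the role of ppc64_pft_size; it is assumed to be
 * at least 1 so that current_shift - 1 does not wrap around.
 */
static bool hpt_resize_wanted(unsigned int target_shift,
			      unsigned int current_shift,
			      bool shrinking)
{
	if (shrinking) {
		/*
		 * Memory removal: keep the hysteresis, only shrink when
		 * the target is at least 2 below the current shift.
		 */
		return target_shift < current_shift - 1;
	}

	/*
	 * Memory hot-add: only ever grow the HPT, never ask for a
	 * resize-down from this path.
	 */
	return target_shift > current_shift;
}
```

With a current shift of 28, a target of 26 no longer triggers a resize on
hot-add, but still does on removal. A target of 27 triggers no resize in
either direction, and a target of 29 triggers a resize on hot-add only.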
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 581b20a2feaf..608e4ed397a9 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -795,7 +795,7 @@ static unsigned long __init htab_get_table_size(void)
}
#ifdef CONFIG_MEMORY_HOTPLUG
-static int resize_hpt_for_hotplug(unsigned long new_mem_size)
+static int resize_hpt_for_hotplug(unsigned long new_mem_size, bool shrinking)
{
unsigned target_hpt_shift;
@@ -804,19 +804,25 @@ static int resize_hpt_for_hotplug(unsigned long new_mem_size)
target_hpt_shift = htab_shift_for_mem_size(new_mem_size);
- /*
- * To avoid lots of HPT resizes if memory size is fluctuating
- * across a boundary, we deliberately have some hysterisis
- * here: we immediately increase the HPT size if the target
- * shift exceeds the current shift, but we won't attempt to
- * reduce unless the target shift is at least 2 below the
- * current shift
- */
- if (target_hpt_shift > ppc64_pft_size ||
- target_hpt_shift < ppc64_pft_size - 1)
- return mmu_hash_ops.resize_hpt(target_hpt_shift);
+ if (shrinking) {
- return 0;
+ /*
+ * To avoid lots of HPT resizes if memory size is fluctuating
+ * across a boundary, we deliberately have some hysterisis
+ * here: we immediately increase the HPT size if the target
+ * shift exceeds the current shift, but we won't attempt to
+ * reduce unless the target shift is at least 2 below the
+ * current shift
+ */
+
+ if (target_hpt_shift >= ppc64_pft_size - 1)
+ return 0;
+
+ } else if (target_hpt_shift <= ppc64_pft_size) {
+ return 0;
+ }
+
+ return mmu_hash_ops.resize_hpt(target_hpt_shift);
}
int hash__create_section_mapping(unsigned long start, unsigned long end,
@@ -829,7 +835,7 @@ int hash__create_section_mapping(unsigned long start, unsigned long end,
return -1;
}
- resize_hpt_for_hotplug(memblock_phys_mem_size());
+ resize_hpt_for_hotplug(memblock_phys_mem_size(), false);
rc = htab_bolt_mapping(start, end, __pa(start),
pgprot_val(prot), mmu_linear_psize,
@@ -848,7 +854,7 @@ int hash__remove_section_mapping(unsigned long start, unsigned long end)
int rc = htab_remove_mapping(start, end, mmu_linear_psize,
mmu_kernel_ssize);
- if (resize_hpt_for_hotplug(memblock_phys_mem_size()) == -ENOSPC)
+ if (resize_hpt_for_hotplug(memblock_phys_mem_size(), true) == -ENOSPC)
pr_warn("Hash collision while resizing HPT\n");
return rc;
--
2.30.2