On Mon, Sep 16, 2024 at 11:08 PM Dev Jain <dev.jain@xxxxxxx> wrote:
For an mTHP allocation, we need to check, for every order, whether
enough contiguous PTEs are empty for that order. Instead of scanning
the whole range again for every order, reuse one piece of information
from the previous iteration, the first set PTE found, to eliminate
some cases. The key to understanding the correctness of the patch
is that the ranges we want to examine form a strictly decreasing
sequence of nested intervals.

Could we include some benchmark data here, as suggested by Ryan in this thread?
https://lore.kernel.org/linux-mm/58f91a56-890a-45d0-8b1f-47c4c70c9600@xxxxxxx/

Thanks

Suggested-by: Ryan Roberts <ryan.roberts@xxxxxxx>
Signed-off-by: Dev Jain <dev.jain@xxxxxxx>
---
mm/memory.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index 8bb1236de93c..e81c6abe09ce 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4633,10 +4633,11 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
+	pte_t *first_set_pte = NULL, *align_pte, *pte;
 	unsigned long orders;
 	struct folio *folio;
 	unsigned long addr;
-	pte_t *pte;
+	int max_empty;
 	gfp_t gfp;
 	int order;
@@ -4671,8 +4672,23 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf)
 	order = highest_order(orders);
 	while (orders) {
 		addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order);
-		if (pte_range_none(pte + pte_index(addr), 1 << order) == 1 << order)
+		align_pte = pte + pte_index(addr);
+
+		/* Range to be scanned known to be empty */
+		if (align_pte + (1 << order) <= first_set_pte)
+			break;
+
+		/* Range to be scanned contains first_set_pte */
+		if (align_pte <= first_set_pte)
+			goto repeat;
+
+		/* align_pte > first_set_pte, so need to check properly */
+		max_empty = pte_range_none(align_pte, 1 << order);
+		if (max_empty == 1 << order)
 			break;
+
+		first_set_pte = align_pte + max_empty;
+repeat:
 		order = next_order(&orders, order);
 	}
--
2.30.2
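
To make the nested-interval argument easier to check outside the kernel, here is a minimal userspace sketch. It is not the kernel code. It models the page table as an array of booleans and uses indices instead of pte_t pointers. The names range_none(), pick_order_naive() and pick_order_skip() are hypothetical, and the scans counter exists only to show how many full range scans each variant performs. A return value of -1 stands for "no suitable order".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_PTES 512	/* assumed number of entries in one modelled table */

/* Hypothetical stand-in for a counting pte_range_none(): returns the
 * number of leading empty slots in [start, start + nr). */
static int range_none(const bool *set, size_t start, int nr)
{
	int i;

	for (i = 0; i < nr; i++)
		if (set[start + i])
			break;
	return i;
}

/* Index of the highest set bit in the mask, or -1 if the mask is empty. */
static int highest_order(unsigned long orders)
{
	int order = -1;

	while (orders) {
		order++;
		orders >>= 1;
	}
	return order;
}

/* Baseline: scan the aligned range from scratch for every order. */
static int pick_order_naive(const bool *set, size_t fault_idx,
			    unsigned long orders, int *scans)
{
	int order = highest_order(orders);

	while (orders) {
		size_t nr = 1UL << order;
		size_t start = fault_idx & ~(nr - 1);

		(*scans)++;
		if ((size_t)range_none(set, start, (int)nr) == nr)
			return order;
		orders &= ~(1UL << order);
		order = highest_order(orders);
	}
	return -1;
}

/* Variant that remembers the first set slot found by an earlier scan. */
static int pick_order_skip(const bool *set, size_t fault_idx,
			   unsigned long orders, int *scans)
{
	bool known = false;	/* plays the role of first_set_pte != NULL */
	size_t first_set = 0;
	int order = highest_order(orders);

	while (orders) {
		size_t nr = 1UL << order;
		size_t start = fault_idx & ~(nr - 1);

		/* Range lies entirely inside the prefix already seen empty. */
		if (known && start + nr <= first_set)
			return order;

		/* Scan only if nothing is known yet or the range starts
		 * past first_set; otherwise it contains first_set. */
		if (!known || start > first_set) {
			int empty;

			(*scans)++;
			empty = range_none(set, start, (int)nr);
			if ((size_t)empty == nr)
				return order;
			first_set = start + (size_t)empty;
			known = true;
		}
		orders &= ~(1UL << order);
		order = highest_order(orders);
	}
	return -1;
}
```

As an illustration, with only slot 5 populated, a fault at index 1 and orders 4, 3, 2 and 1 enabled (mask 0x1e), both variants settle on order 2. The baseline performs three scans, while the skipping variant performs one: order 3 is rejected because its range contains slot 5, and order 2 is accepted because its range ends before slot 5. Scan counts in a toy model say nothing about real-world gains, so they are no substitute for the benchmark data requested above.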
barry