Re: [PATCH v1 1/2] mm: let pte_lockptr() consume a pte_t pointer
From: David Hildenbrand
Date: Mon Jul 29 2024 - 04:46:22 EST
On 29.07.24 09:48, Qi Zheng wrote:
> On 2024/7/26 02:39, David Hildenbrand wrote:
>> pte_lockptr() is the only *_lockptr() function that doesn't consume
>> what would be expected: it consumes a pmd_t pointer instead of a pte_t
>> pointer.
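
For anyone not staring at the diff, the interface change is essentially the
following (a rough sketch of the split-PTE-ptlock case; the hunks in the
patch itself are authoritative):

```
/* Before: the lock is looked up via the pmd entry mapping the PTE table. */
static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pmd_t *pmd)
{
	return ptlock_ptr(page_ptdesc(pmd_page(*pmd)));
}

/* After: the lock is derived from the PTE pointer itself. */
static inline spinlock_t *pte_lockptr(struct mm_struct *mm, pte_t *pte)
{
	/* A PTE page table never spans more than a single page. */
	return ptlock_ptr(virt_to_ptdesc(pte));
}
```
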
>> Let's change that. The two callers in pgtable-generic.c are easily
>> adjusted. Adjust khugepaged.c:retract_page_tables() to simply do a
>> pte_offset_map_nolock() to obtain the lock, even though we won't actually
>> be traversing the page table.
>>
>> This makes the code more similar to the other variants and avoids other
>> hacks to make the new pte_lockptr() version happy. pte_lockptr() users
>> reside now only in pgtable-generic.c.
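
In retract_page_tables(), that amounts to a locking pattern along these lines
(simplified sketch with a made-up helper name; the real code handles more
cases):

```
#include <linux/mm.h>

/* Simplified sketch, not the actual retract_page_tables() code. */
static void retract_one_table(struct mm_struct *mm, pmd_t *pmd,
			      unsigned long addr)
{
	spinlock_t *pml, *ptl;
	pte_t *pte;

	pml = pmd_lock(mm, pmd);	/* serializes changes to the pmd entry */

	/*
	 * We only want the PTE table's lock pointer, we won't traverse the
	 * table. pte_offset_map_nolock() also tells us whether the PTE table
	 * was removed in the meantime: it returns NULL in that case.
	 */
	pte = pte_offset_map_nolock(mm, pmd, addr, &ptl);
	if (!pte) {
		spin_unlock(pml);
		return;
	}
	pte_unmap(pte);

	if (ptl != pml)
		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);

	/* ... do the actual retracting under both locks ... */

	if (ptl != pml)
		spin_unlock(ptl);
	spin_unlock(pml);
}
```
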
>> Maybe, using pte_offset_map_nolock() is the right thing to do because
>> the PTE table could have been removed in the meantime? At least it sounds
>> more future proof if we ever have other means of page table reclaim.

> Agree, this helps us recheck the pmd entry.

>> It's not quite clear if holding the PTE table lock is really required:
>> what if someone else obtains the lock just after we unlock it? But we'll
>> leave that as is for now, maybe there are good reasons.
>>
>> This is a preparation for adapting hugetlb page table locking logic to
>> take the same locks as core-mm page table walkers would.
>>
>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
>> ---
>>  include/linux/mm.h   |  7 ++++---
>>  mm/khugepaged.c      | 21 +++++++++++++++------
>>  mm/pgtable-generic.c |  4 ++--
>>  3 files changed, 21 insertions(+), 11 deletions(-)
> Since pte_lockptr() no longer has a pmd parameter, it is best to modify
> the comments above __pte_offset_map_lock() as well:
>
> ```
> This helps the caller to avoid a later pte_lockptr(mm, *pmd), which
> might by that time act on a changed *pmd ...
> ```

Right, thanks a lot for the review!
The following on top:
From a46b16aa9bfa02ffb425d364d7f00129a8e803ad Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david@xxxxxxxxxx>
Date: Mon, 29 Jul 2024 10:43:34 +0200
Subject: [PATCH] fixup: mm: let pte_lockptr() consume a pte_t pointer
Let's adjust the comment, passing a pte to pte_lockptr() and dropping
a detail about changed *pmd, which no longer applies.
Signed-off-by: David Hildenbrand <david@xxxxxxxxxx>
---
mm/pgtable-generic.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 13a7705df3f87..f17465b43d344 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -350,11 +350,11 @@ pte_t *pte_offset_map_nolock(struct mm_struct *mm, pmd_t *pmd,
* pte_offset_map_nolock(mm, pmd, addr, ptlp), above, is like pte_offset_map();
* but when successful, it also outputs a pointer to the spinlock in ptlp - as
* pte_offset_map_lock() does, but in this case without locking it. This helps
- * the caller to avoid a later pte_lockptr(mm, *pmd), which might by that time
- * act on a changed *pmd: pte_offset_map_nolock() provides the correct spinlock
- * pointer for the page table that it returns. In principle, the caller should
- * recheck *pmd once the lock is taken; in practice, no callsite needs that -
- * either the mmap_lock for write, or pte_same() check on contents, is enough.
+ * the caller to avoid a later pte_lockptr(mm, pte): pte_offset_map_nolock()
+ * provides the correct spinlock pointer for the page table that it returns.
+ * In principle, the caller should recheck *pmd once the lock is taken; in
+ * practice, no callsite needs that - either the mmap_lock for write, or
+ * pte_same() check on contents, is enough.
*
* Note that free_pgtables(), used after unmapping detached vmas, or when
* exiting the whole mm, does not take page table lock before freeing a page
--
2.45.2
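
FWIW, the "pte_same() check on contents" that the comment refers to usually
looks something like the sketch below in callers (illustrative only, with a
hypothetical helper name, not lifted from an actual callsite):

```
#include <linux/mm.h>

/* Hypothetical helper, for illustration only. */
static bool lock_pte_if_unchanged(struct mm_struct *mm, pmd_t *pmd,
				  unsigned long addr, pte_t orig_pte,
				  pte_t **ptep, spinlock_t **ptlp)
{
	spinlock_t *ptl;
	pte_t *pte;

	pte = pte_offset_map_nolock(mm, pmd, addr, &ptl);
	if (!pte)
		return false;	/* the PTE table is gone */

	spin_lock(ptl);
	if (!pte_same(ptep_get(pte), orig_pte)) {
		/* The PTE changed under us; let the caller retry. */
		spin_unlock(ptl);
		pte_unmap(pte);
		return false;
	}

	/* The caller now holds the PTL and must pte_unmap() + unlock later. */
	*ptep = pte;
	*ptlp = ptl;
	return true;
}
```
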
--
Cheers,
David / dhildenb