Re: [PATCH v2 3/5] mm: swap_pte_batch: add an output argument to reture if all swap entries are exclusive

From: David Hildenbrand
Date: Thu Apr 11 2024 - 11:00:37 EST


On 11.04.24 16:54, Ryan Roberts wrote:
On 09/04/2024 09:26, Barry Song wrote:
From: Barry Song <v-songbaohua@xxxxxxxx>

Add a boolean argument named any_shared. If any of the swap entries are
non-exclusive, set any_shared to true. The function do_swap_page() can
then utilize this information to determine whether the entire large
folio can be reused.

Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
---
mm/internal.h | 9 ++++++++-
mm/madvise.c | 2 +-
mm/memory.c | 2 +-
3 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 9d3250b4a08a..cae39c372bfc 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -238,7 +238,8 @@ static inline pte_t pte_next_swp_offset(pte_t pte)
*
* Return: the number of table entries in the batch.
*/
-static inline int swap_pte_batch(pte_t *start_ptep, int max_nr, pte_t pte)
+static inline int swap_pte_batch(pte_t *start_ptep, int max_nr, pte_t pte,
+ bool *any_shared)

Please update the docs in the comment above this for the new param; follow
folio_pte_batch()'s docs as a template.

{
pte_t expected_pte = pte_next_swp_offset(pte);
const pte_t *end_ptep = start_ptep + max_nr;
@@ -248,12 +249,18 @@ static inline int swap_pte_batch(pte_t *start_ptep, int max_nr, pte_t pte)
VM_WARN_ON(!is_swap_pte(pte));
VM_WARN_ON(non_swap_entry(pte_to_swp_entry(pte)));
+ if (any_shared)
+ *any_shared |= !pte_swp_exclusive(pte);

This is different from the approach in folio_pte_batch(). It inits *any_shared
to false and does NOT include the value of the first pte. I think that's odd,
personally and I prefer your approach. I'm not sure if there was a good reason
that David chose the other approach?

Because in my case calling code does

nr = folio_pte_batch(folio, addr, src_pte, pte, max_nr, flags,
&any_writable);

..

if (any_writable)
pte = pte_mkwrite(pte, src_vma);

..

and later checks in another function pte_write().

So if the common pattern is that the original PTE will be used for checks, then it doesn't make sense to unnecessary checks+setting for the first PTE.

--
Cheers,

David / dhildenb