Re: [PATCH v3 02/16] mm: introduce leaf entry type and use to simplify leaf entry logic
From: Zi Yan
Date: Mon Nov 10 2025 - 22:26:06 EST
On 10 Nov 2025, at 17:21, Lorenzo Stoakes wrote:
> The kernel maintains leaf page table entries which contain either:
>
> - Nothing ('none' entries)
> - Present entries (that is stuff the hardware can navigate without fault)

This is not true for:

1. pXX_protnone(), where the _PAGE_PROTNONE flag also makes pXX_present()
   return true, yet the hardware would still trigger a fault.
2. pmd_present(), where _PAGE_PSE alone also means a present PMD (see the
   comment in pmd_present()).

This commit log needs to be updated.

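
To make the distinction concrete, here is a minimal userspace sketch, loosely
modeled on the x86 definitions. The bit positions and the names hw_present(),
sw_pte_present() and sw_pmd_present() are illustrative assumptions for this
email, not the kernel's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits, loosely modeled on x86 (assumed, not authoritative). */
#define MODEL_PAGE_PRESENT  (UINT64_C(1) << 0) /* the MMU can walk this entry */
#define MODEL_PAGE_PSE      (UINT64_C(1) << 7) /* huge entry; stays set while a PMD is being split */
#define MODEL_PAGE_PROTNONE (UINT64_C(1) << 8) /* PROT_NONE / NUMA hinting entry */

/* Hardware view: only the present bit lets the walk proceed without a fault. */
static bool hw_present(uint64_t flags)
{
	return (flags & MODEL_PAGE_PRESENT) != 0;
}

/* Software view of a PTE: a protnone entry also counts as present. */
static bool sw_pte_present(uint64_t flags)
{
	return (flags & (MODEL_PAGE_PRESENT | MODEL_PAGE_PROTNONE)) != 0;
}

/* Software view of a PMD: PSE alone also counts as present. */
static bool sw_pmd_present(uint64_t flags)
{
	return (flags & (MODEL_PAGE_PRESENT | MODEL_PAGE_PROTNONE |
			 MODEL_PAGE_PSE)) != 0;
}
```

In this model, sw_pte_present(MODEL_PAGE_PROTNONE) is true while
hw_present(MODEL_PAGE_PROTNONE) is false, which is the case the commit log
wording does not cover.
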
> - Everything else that will cause a fault which the kernel handles

This is not true either, for the reasons above.

How should we categorize these entries that are non-present to HW but
present to SW, such as protnone entries and PMDs under splitting? Strictly
speaking, they are softleaf entries, but treating them that way would
require more changes to the kernel code and would require pXX_present() to
mean HW present.

To avoid making this series more complicated, I think updating the commit
log and comments to use pXX_present() instead of HW present might be the
easiest way out. We can revisit pXX_present() vs HW present later.

OK, I will focus on code review now.

>
> In the 'everything else' group we include swap entries, but we also include
> a number of other things such as migration entries, device private entries
> and marker entries.
>
> Unfortunately this 'everything else' group expresses everything through
> a swp_entry_t type, and these entries are referred to as swap entries even
> though they may well not contain a... swap entry.
>
> This is compounded by the rather mind-boggling concept of a non-swap swap
> entry (checked via non_swap_entry()) and the means by which we twist and
> turn to satisfy this.
>
> This patch lays the foundation for reducing this confusion.
>
> We refer to 'everything else' as a 'software-defined leaf entry' or
> 'softleaf' for short. And in fact we scoop up the 'none' entries into this
> concept also so we are left with:
>
> - Present entries.
> - Softleaf entries (which may be empty).
>
> This allows for radical simplification across the board - one can simply
> convert any leaf page table entry to a leaf entry via softleaf_from_pte().
>
> If the entry is present, we return an empty leaf entry, so it is assumed
> the caller is aware that they must differentiate between the two categories
> of page table entries, checking for the former via pte_present().
>
> As a result, we can eliminate a number of places where we would otherwise
> need to use predicates to see if we can proceed with leaf page table entry
> conversion and instead just go ahead and do it unconditionally.
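
For readers following the quoted description, the unconditional conversion
can be sketched in userspace C roughly as below. This is only a model under
stated assumptions: the model_* types and helpers, the single present bit,
and the "copy the raw value" encoding are all hypothetical and are not taken
from the patch's leafops.h:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for pte_t and softleaf_t. */
typedef struct { uint64_t val; } model_pte_t;
typedef struct { uint64_t val; } model_softleaf_t;

#define MODEL_PTE_PRESENT (UINT64_C(1) << 0) /* assumed present bit */

static bool model_pte_present(model_pte_t pte)
{
	return (pte.val & MODEL_PTE_PRESENT) != 0;
}

/* An all-zero leaf entry stands for "empty" in this model. */
static bool model_softleaf_is_none(model_softleaf_t entry)
{
	return entry.val == 0;
}

/*
 * Convert any leaf PTE without checking predicates first. A present PTE
 * yields an empty leaf entry, so the caller must still use
 * model_pte_present() to tell "present" apart from "none".
 */
static model_softleaf_t model_softleaf_from_pte(model_pte_t pte)
{
	model_softleaf_t entry = { 0 };

	if (!model_pte_present(pte))
		entry.val = pte.val; /* assumed encoding: raw non-present value */
	return entry;
}
```

A caller in this model converts first and inspects the result afterwards,
rather than guarding the conversion with a "is this a swap PTE?" check.
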
>
> We do so where we can, adjusting surrounding logic as necessary to
> integrate the new softleaf_t logic as far as seems reasonable at this
> stage.
>
> We typedef swp_entry_t to softleaf_t for the time being until the
> conversion can be complete, meaning everything remains compatible
> regardless of which type is used. We will eventually remove swp_entry_t
> when the conversion is complete.
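
As a small illustration of why the typedef keeps both names compatible during
the transition, here is a userspace sketch. The model_* names and the two
helper functions are hypothetical; only the idea of aliasing one type to the
other comes from the quoted text:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the existing swap entry type. */
typedef struct { uint64_t val; } model_swp_entry_t;

/* The new name is a plain alias, so both names denote the same type. */
typedef model_swp_entry_t model_softleaf_t;

/* "Old" code that still takes the legacy type. */
static uint64_t legacy_entry_val(model_swp_entry_t entry)
{
	return entry.val;
}

/* "New" code that already produces the new type. */
static model_softleaf_t new_make_entry(uint64_t val)
{
	model_softleaf_t entry = { val };

	return entry;
}
```

Because the alias introduces no new type, legacy_entry_val(new_make_entry(42))
compiles without a cast, so callers can be converted one at a time.
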
>
> We introduce a new header file to keep things clear - leafops.h - this
> imports swapops.h so it can directly replace swapops imports without
> issue, and we do so in all the files that require it.
>
> Additionally, add new leafops.h file to core mm maintainers entry.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@xxxxxxxxxx>
> ---
> MAINTAINERS | 1 +
> fs/proc/task_mmu.c | 26 +--
> fs/userfaultfd.c | 6 +-
> include/linux/leafops.h | 387 ++++++++++++++++++++++++++++++++++
> include/linux/mm_inline.h | 6 +-
> include/linux/mm_types.h | 25 +++
> include/linux/swapops.h | 28 ---
> include/linux/userfaultfd_k.h | 51 +----
> mm/hmm.c | 2 +-
> mm/hugetlb.c | 37 ++--
> mm/madvise.c | 16 +-
> mm/memory.c | 41 ++--
> mm/mincore.c | 6 +-
> mm/mprotect.c | 6 +-
> mm/mremap.c | 4 +-
> mm/page_vma_mapped.c | 11 +-
> mm/shmem.c | 7 +-
> mm/userfaultfd.c | 6 +-
> 18 files changed, 502 insertions(+), 164 deletions(-)
> create mode 100644 include/linux/leafops.h
>
Best Regards,
Yan, Zi