Re: [RFC PATCH 1/2] mm: make lazy MMU mode context-aware
From: David Hildenbrand (Arm)
Date: Wed Mar 25 2026 - 05:55:36 EST
On 3/25/26 08:41, Alexander Gordeev wrote:
> Lazy MMU mode is assumed to be context-independent, in the sense
> that it does not need any additional information while operating.
> However, the s390 architecture benefits from knowing the exact
> page table entries being modified.
>
> Introduce lazy_mmu_mode_enable_pte(), which is provided with the
> process address space and the page table being operated on. This
> information is required to enable s390-specific optimizations.
>
> The function takes parameters that are typically passed to page-
> table level walkers, which implies that the span of PTE entries
> never crosses a page table boundary.
>
> Architectures that do not require such information simply do not
> need to define the arch_enter_lazy_mmu_mode_pte() callback.
>
> Signed-off-by: Alexander Gordeev <agordeev@xxxxxxxxxxxxx>
> ---
> fs/proc/task_mmu.c | 2 +-
> include/linux/pgtable.h | 42 +++++++++++++++++++++++++++++++++++++++++
> mm/madvise.c | 8 ++++----
> mm/memory.c | 8 ++++----
> mm/mprotect.c | 2 +-
> mm/mremap.c | 2 +-
> mm/vmalloc.c | 6 +++---
> 7 files changed, 56 insertions(+), 14 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index e091931d7ca1..4e3b1987874a 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -2752,7 +2752,7 @@ static int pagemap_scan_pmd_entry(pmd_t *pmd, unsigned long start,
> return 0;
> }
>
> - lazy_mmu_mode_enable();
> + lazy_mmu_mode_enable_pte(vma->vm_mm, start, end, start_pte);
>
> if ((p->arg.flags & PM_SCAN_WP_MATCHING) && !p->vec_out) {
> /* Fast path for performing exclusive WP */
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index a50df42a893f..481b45954800 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -271,6 +271,44 @@ static inline void lazy_mmu_mode_enable(void)
> arch_enter_lazy_mmu_mode();
> }
>
> +#ifndef arch_enter_lazy_mmu_mode_pte
> +static inline void arch_enter_lazy_mmu_mode_pte(struct mm_struct *mm,
> + unsigned long addr,
> + unsigned long end,
> + pte_t *ptep)
Two-tab alignment, please. (Applies to other places here as well.)
> +{
> + arch_enter_lazy_mmu_mode();
> +}
> +#endif
> +
> +/**
> + * lazy_mmu_mode_enable_pte() - Enable the lazy MMU mode with parameters
You have to be a lot clearer about the implications. For example, what
happens if we bail out early and do not process all PTEs? What are the
exact semantics?
> + *
> + * Enters a new lazy MMU mode section; if the mode was not already enabled,
> + * enables it and calls arch_enter_lazy_mmu_mode_pte().
> + *
> + * Must be paired with a call to lazy_mmu_mode_disable().
> + *
> + * Has no effect if called:
> + * - While paused - see lazy_mmu_mode_pause()
> + * - In interrupt context
> + */
> +static inline void lazy_mmu_mode_enable_pte(struct mm_struct *mm,
> + unsigned long addr,
> + unsigned long end,
> + pte_t *ptep)
It can cover multiple PTEs, so should this be some kind of "pte_range"
variant, e.g. lazy_mmu_mode_enable_for_pte_range()?
A bit of a mouthful, but clearer.
> +{
> + struct lazy_mmu_state *state = &current->lazy_mmu_state;
> +
> + if (in_interrupt() || state->pause_count > 0)
> + return;
> +
> + VM_WARN_ON_ONCE(state->enable_count == U8_MAX);
> +
> + if (state->enable_count++ == 0)
> + arch_enter_lazy_mmu_mode_pte(mm, addr, end, ptep);
> +}
I'm wondering whether that could instead be some optional interface that
we trigger after lazy_mmu_mode_enable(). But looking at the
lazy_mmu_mode_enable() users, there don't seem to be cases where we
would process multiple different ranges under a single enable() call, right?
--
Cheers,
David