Re: [PATCH v3 11/13] arm64: mm: More flags for __flush_tlb_range()

From: Jonathan Cameron

Date: Tue Mar 03 2026 - 12:52:51 EST


On Tue, 3 Mar 2026 13:54:33 +0000
Ryan Roberts <ryan.roberts@xxxxxxx> wrote:

> On 03/03/2026 09:57, Jonathan Cameron wrote:
> > On Mon, 2 Mar 2026 13:55:58 +0000
> > Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
> >
> >> Refactor function variants with "_nosync", "_local" and "_nonotify" into
> >> a single __always_inline implementation that takes flags and rely on
> >> constant folding to select the parts that are actually needed at any
> >> given callsite, based on the provided flags.
> >>
> >> Flags all live in the tlbf_t (TLB flags) type; TLBF_NONE (0) continues
> >> to provide the strongest semantics (i.e. evict from walk cache,
> >> broadcast, synchronise and notify). Each flag reduces the strength in
> >> some way; TLBF_NONOTIFY, TLBF_NOSYNC and TLBF_NOBROADCAST are added to
> >> complement the existing TLBF_NOWALKCACHE.
> >>
> >> There are no users that require TLBF_NOBROADCAST without
> >> TLBF_NOWALKCACHE so implement that as BUILD_BUG() to avoid needing to
> >> introduce dead code for vae1 invalidations.
> >>
> >> The result is a clearer, simpler, more powerful API.
> > Hi Ryan,
> >
> > There is one subtle change to rounding that should at least be called out.
>
> Thanks for the review. I'm confident that there isn't actually a change to the
> rounding here, but the responsibility has moved to the caller. See below...
>
> >
> > Might even be worth pulling it to a precursor patch where you can add an
> > explanation of why original code was rounding to a larger value than was
> > ever needed.
> >
> > Jonathan
> >
> >
> >>
> >> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
> >
> >
> >> static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >> @@ -586,24 +615,9 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
> >> unsigned long stride, int tlb_level,
> >> tlbf_t flags)
> >> {
> >> - __flush_tlb_range_nosync(vma->vm_mm, start, end, stride,
> >> - tlb_level, flags);
> >> - __tlbi_sync_s1ish();
> >> -}
> >> -
> >> -static inline void local_flush_tlb_contpte(struct vm_area_struct *vma,
> >> - unsigned long addr)
> >> -{
> >> - unsigned long asid;
> >> -
> >> - addr = round_down(addr, CONT_PTE_SIZE);
> > See below.
> >> -
> >> - dsb(nshst);
> >> - asid = ASID(vma->vm_mm);
> >> - __flush_s1_tlb_range_op(vale1, addr, CONT_PTES, PAGE_SIZE, asid, 3);
> >> - mmu_notifier_arch_invalidate_secondary_tlbs(vma->vm_mm, addr,
> >> - addr + CONT_PTE_SIZE);
> >> - dsb(nsh);
> >> + start = round_down(start, stride);
> > See below.
> >> + end = round_up(end, stride);
> >> + __do_flush_tlb_range(vma, start, end, stride, tlb_level, flags);
> >> }
> >
> >>
> >> static inline bool __pte_flags_need_flush(ptdesc_t oldval, ptdesc_t newval)
> >> diff --git a/arch/arm64/mm/contpte.c b/arch/arm64/mm/contpte.c
> >> index 681f22fac52a1..3f1a3e86353de 100644
> >> --- a/arch/arm64/mm/contpte.c
> >> +++ b/arch/arm64/mm/contpte.c
> > ...
> >
> >> @@ -641,7 +641,10 @@ int contpte_ptep_set_access_flags(struct vm_area_struct *vma,
> >> __ptep_set_access_flags(vma, addr, ptep, entry, 0);
> >>
> >> if (dirty)
> >> - local_flush_tlb_contpte(vma, start_addr);
> >> + __flush_tlb_range(vma, start_addr,
> >> + start_addr + CONT_PTE_SIZE,
> >> + PAGE_SIZE, 3,
> >
> > This results in rounding down with a different stride.
> > local_flush_tlb_contpte() did
> > addr = round_down(addr, CONT_PTE_SIZE);
> >
> > With this call we have
> > start = round_down(start, stride); where stride is PAGE_SIZE.
> >
> > I'm too lazy to figure out if that matters.
>
> contpte_ptep_set_access_flags() is operating on a contpte block of ptes, and as
> such, start_addr has already been rounded down to the start of the block, which
> is always bigger than (and perfectly divisible by) PAGE_SIZE.
>
> Previously, local_flush_tlb_contpte() allowed passing any VA within the
> contpte block and the function would automatically round it down to the start
> of the block and invalidate the full block.
>
> After the change, we are explicitly passing the already aligned block;
> start_addr is already guaranteed to be at the start of the block and "start_addr
> + CONT_PTE_SIZE" is the end.
>
> So in both cases, the rounding down that is done by local_flush_tlb_contpte() /
> __flush_tlb_range() doesn't actually change the value.

Ah ok, so the key is that the round down in local_flush_tlb_contpte() never
did anything in practice, because its only caller is
contpte_ptep_set_access_flags() and that aligns the address down a couple of
lines before the call. I should have spent a few seconds looking! :(

Maybe, if you are respinning, just throw in a one-line comment on this in the
commit description.

Reviewed-by: Jonathan Cameron <jonathan.cameron@xxxxxxxxxx>

>
> Thanks,
> Ryan
>
>
> >
> >
> >> + TLBF_NOWALKCACHE | TLBF_NOBROADCAST);
> >> } else {
> >> __contpte_try_unfold(vma->vm_mm, addr, ptep, orig_pte);
> >> __ptep_set_access_flags(vma, addr, ptep, entry, dirty);
> >
>
>