Re: [PATCH v3 13/13] arm64: mm: Provide level hint for flush_tlb_page()
From: Ryan Roberts
Date: Mon Mar 02 2026 - 12:43:22 EST
On 02/03/2026 14:42, Mark Rutland wrote:
> Hi Ryan,
>
> On Mon, Mar 02, 2026 at 01:56:00PM +0000, Ryan Roberts wrote:
>> Previously tlb invalidations issued by __flush_tlb_page() did not
>> contain a level hint. But the function is clearly only ever targeting
>> level 3 tlb entries and its documentation agrees:
>>
>> | this operation only invalidates a single, last-level page-table
>> | entry and therefore does not affect any walk-caches
>
> FWIW, I'd have read "last-level" as synonymous with "leaf" (i.e. a Page
> or Block entry, which is the last level of walk) rather than level 3
> specifically. The architecture uses the term to match the former (e.g.
> in the description of TLBI VALE1IS).
Hmm yeah, now that I'm re-reading, I agree that quoted documentation doesn't say
anything about it being level 3 specific.
But actually that was arm64-specific documentation for flush_tlb_page(), which
is a core-mm function. The generic docs at Documentation/core-api/cachetlb.rst
make it clear that it's intended only for PTE invalidations, I think:
| 4) ``void flush_tlb_page(struct vm_area_struct *vma, unsigned long addr)``
|
| This time we need to remove the PAGE_SIZE sized translation
| from the TLB. ...
>
> If we're tightening up __flush_tlb_page(), I think it'd be worth either
> updating the comment to explicitly note that this only applies to level
> 3 entries, OR update the comment+name to say it applies to leaf entries,
> and have it take a level parameter.
I'll fix the arm64-specific docs to align with the generic docs and replace
"last-level" with "level 3" if that works for you.
>
>> However, it turns out that the function was actually being used to
>> invalidate a level 2 mapping via flush_tlb_fix_spurious_fault_pmd().
>> The bug was benign because the level hint was not set so the HW would
>> still invalidate the PMD mapping, and also because the TLBF_NONOTIFY
>> flag was set, the bounds of the mapping were never used for anything
>> else.
>
> I suspect (as above) that the current usage was intentional, legitimate
> usage, just poorly documented.
Before this series flush_tlb_fix_spurious_fault_pmd() was implemented using
local_flush_tlb_page_nonotify() which never even gives an option to set the TTL
hint, so I agree.
But I don't think flush_tlb_fix_spurious_fault_pmd() should be calling any tlb
flush api that has "page" in the name since that implies PTE, not PMD.
I think what I have done is an improvement; but I'm happy to soften/correct this
description in the next version.
>
>> Now that we have the new and improved range-invalidation API, it is
>> trival to fix flush_tlb_fix_spurious_fault_pmd() to explicitly flush the
>> whole range (locally, without notification and last level only). So
>> let's do that, and then update __flush_tlb_page() to hint level 3.
>
> Do we never use __flush_tlb_page() to manipulate a level 1 block
> mapping? I'd have expected we did the same lazy invalidation for
> permission relazation there, but if that's not the case, then this seems
> fine in principle.
No, there is no flush_tlb_fix_spurious_fault_pud().
(flush_tlb_fix_spurious_fault_pmd() was only added last cycle).
>
>> Reviewed-by: Linu Cherian <linu.cherian@xxxxxxx>
>> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
>> ---
>> arch/arm64/include/asm/pgtable.h | 5 +++--
>> arch/arm64/include/asm/tlbflush.h | 2 +-
>> 2 files changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index 7039931df4622..b1a96a8f2b17e 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -103,8 +103,9 @@ static inline void arch_leave_lazy_mmu_mode(void)
>> #define flush_tlb_fix_spurious_fault(vma, address, ptep) \
>> __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
>>
>> -#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
>> - __flush_tlb_page(vma, address, TLBF_NOBROADCAST | TLBF_NONOTIFY)
>> +#define flush_tlb_fix_spurious_fault_pmd(vma, address, pmdp) \
>> + __flush_tlb_range(vma, address, address + PMD_SIZE, PMD_SIZE, 2, \
>> + TLBF_NOBROADCAST | TLBF_NONOTIFY | TLBF_NOWALKCACHE)
>
> Is there a reason to keep __flush_tlb_page(), rather than defining
> flush_tlb_fix_spurious_fault() in terms of __flush_tlb_range() with all
> the level 3 constants?
__flush_tlb_page() is called by __ptep_clear_flush_young() and
__ptep_set_access_flags() (as well as by flush_tlb_page()). I could replace them
all, but __flush_tlb_page() is a bit less verbose... I have no strong preference.
Thanks,
Ryan
>
> Mark.
>
>>
>> /*
>> * ZERO_PAGE is a global shared page that is always zero: used
>> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
>> index 5096ec7ab8650..958fe97b744e5 100644
>> --- a/arch/arm64/include/asm/tlbflush.h
>> +++ b/arch/arm64/include/asm/tlbflush.h
>> @@ -591,7 +591,7 @@ static inline void __flush_tlb_page(struct vm_area_struct *vma,
>> unsigned long start = round_down(uaddr, PAGE_SIZE);
>> unsigned long end = start + PAGE_SIZE;
>>
>> - __do_flush_tlb_range(vma, start, end, PAGE_SIZE, TLBI_TTL_UNKNOWN,
>> + __do_flush_tlb_range(vma, start, end, PAGE_SIZE, 3,
>> TLBF_NOWALKCACHE | flags);
>> }
>>
>> --
>> 2.43.0
>>