Re: [PATCH] [arm64/tlb] Fix mmu notifiers for range-based invalidates
From: Catalin Marinas
Date: Wed Mar 05 2025 - 13:48:37 EST
On Tue, Mar 04, 2025 at 12:51:27AM -0800, Piotr Jaroszynski wrote:
> Update the __flush_tlb_range_op macro not to modify its parameters as
> these are unexpected semantics. In practice, this fixes the call to
> mmu_notifier_arch_invalidate_secondary_tlbs() in
> __flush_tlb_range_nosync() to use the correct range instead of an empty
> range with start=end. The empty range was (un)lucky as it results in
> taking the invalidate-all path that doesn't cause correctness issues,
> but can certainly result in suboptimal perf.
>
> This has been broken since commit 6bbd42e2df8f ("mmu_notifiers: call
> invalidate_range() when invalidating TLBs") when the call to the
> notifiers was added to __flush_tlb_range(). It predates the addition of
> the __flush_tlb_range_op() macro from commit 360839027a6e ("arm64: tlb:
> Refactor the core flush algorithm of __flush_tlb_range") that made the
> bug hard to spot.
That's the problem with macros.
Reviewed-by: Catalin Marinas <catalin.marinas@xxxxxxx>
Will, do you want to take this as a fix? It's only a performance
regression for now, though a macro that modifies its arguments could
well break other callers at some point.
> Fixes: 6bbd42e2df8f ("mmu_notifiers: call invalidate_range() when invalidating TLBs")
>
> Signed-off-by: Piotr Jaroszynski <pjaroszynski@xxxxxxxxxx>
> Cc: Catalin Marinas <catalin.marinas@xxxxxxx>
> Cc: Will Deacon <will@xxxxxxxxxx>
> Cc: Robin Murphy <robin.murphy@xxxxxxx>
> Cc: Alistair Popple <apopple@xxxxxxxxxx>
> Cc: Raghavendra Rao Ananta <rananta@xxxxxxxxxx>
> Cc: SeongJae Park <sj@xxxxxxxxxx>
> Cc: Jason Gunthorpe <jgg@xxxxxxxxxx>
> Cc: John Hubbard <jhubbard@xxxxxxxxxx>
> Cc: Nicolin Chen <nicolinc@xxxxxxxxxx>
> Cc: linux-arm-kernel@xxxxxxxxxxxxxxxxxxx
> Cc: iommu@xxxxxxxxxxxxxxx
> Cc: linux-mm@xxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: stable@xxxxxxxxxxxxxxx
> ---
> arch/arm64/include/asm/tlbflush.h | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
> index bc94e036a26b..8104aee4f9a0 100644
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -396,33 +396,35 @@ static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
> #define __flush_tlb_range_op(op, start, pages, stride, \
> asid, tlb_level, tlbi_user, lpa2) \
> do { \
> + typeof(start) __flush_start = start; \
> + typeof(pages) __flush_pages = pages; \
> int num = 0; \
> int scale = 3; \
> int shift = lpa2 ? 16 : PAGE_SHIFT; \
> unsigned long addr; \
> \
> - while (pages > 0) { \
> + while (__flush_pages > 0) { \
> if (!system_supports_tlb_range() || \
> - pages == 1 || \
> - (lpa2 && start != ALIGN(start, SZ_64K))) { \
> - addr = __TLBI_VADDR(start, asid); \
> + __flush_pages == 1 || \
> + (lpa2 && __flush_start != ALIGN(__flush_start, SZ_64K))) { \
> + addr = __TLBI_VADDR(__flush_start, asid); \
> __tlbi_level(op, addr, tlb_level); \
> if (tlbi_user) \
> __tlbi_user_level(op, addr, tlb_level); \
> - start += stride; \
> - pages -= stride >> PAGE_SHIFT; \
> + __flush_start += stride; \
> + __flush_pages -= stride >> PAGE_SHIFT; \
> continue; \
> } \
> \
> - num = __TLBI_RANGE_NUM(pages, scale); \
> + num = __TLBI_RANGE_NUM(__flush_pages, scale); \
> if (num >= 0) { \
> - addr = __TLBI_VADDR_RANGE(start >> shift, asid, \
> + addr = __TLBI_VADDR_RANGE(__flush_start >> shift, asid, \
> scale, num, tlb_level); \
> __tlbi(r##op, addr); \
> if (tlbi_user) \
> __tlbi_user(r##op, addr); \
> - start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> - pages -= __TLBI_RANGE_PAGES(num, scale); \
> + __flush_start += __TLBI_RANGE_PAGES(num, scale) << PAGE_SHIFT; \
> + __flush_pages -= __TLBI_RANGE_PAGES(num, scale);\
> } \
> scale--; \
> } \
>
> base-commit: 99fa936e8e4f117d62f229003c9799686f74cebc
> --
> 2.22.1.7.gac84d6e93c.dirty
--
Catalin