[PATCH v3 0/3] arm64: tlb: Fix TLBI RANGE operand

From: Gavin Shan
Date: Thu Apr 04 2024 - 23:59:32 EST


A kernel crash on the destination VM after live migration was
reported by Yihuang Yu. The issue is only reproducible on NVIDIA's
Grace Hopper, where the TLBI RANGE feature is available. The crash
is caused by an incomplete TLB flush and missed dirty pages. For the
root cause and analysis, please refer to the commit log of PATCH[v3 1/3].

Thanks to Marc Zyngier who proposed all the code changes.

PATCH[1] fixes the kernel crash by extending __TLBI_RANGE_NUM() so that
         a TLBI RANGE operation on an area of MAX_TLBI_RANGE_PAGES pages
         is supported
PATCH[2] improves __TLBI_VADDR_RANGE() with masks and FIELD_PREP()
PATCH[3] allows a TLBI RANGE operation on an area of MAX_TLBI_RANGE_PAGES
         pages in __flush_tlb_range_nosync()
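
To illustrate why the MAX_TLBI_RANGE_PAGES case in PATCH[1] needs care,
here is a minimal standalone sketch. It assumes the architectural
encoding in which one range operation covers
(NUM + 1) << (5 * SCALE + 1) pages, with a 5-bit NUM field. The helpers
range_pages(), range_num_masked() and range_num_clamped() are
hypothetical stand-ins written for this example, not the kernel macros;
the masked and clamped forms are only assumed to resemble the behaviour
before and after the fix. See PATCH[v3 1/3] for the real change.

```c
#include <assert.h>

/* Assumption: NUM is a 5-bit field, so it can hold 0..31. */
#define RANGE_NUM_MASK	0x1fUL

/* Pages covered by one range operation for a given NUM and SCALE. */
static inline unsigned long range_pages(int num, int scale)
{
	return (unsigned long)(num + 1) << (5 * scale + 1);
}

/*
 * Masking form (hypothetical): shift, mask to 5 bits, subtract one.
 * A quotient of exactly 32 is truncated to 0, so the result is -1.
 */
static inline int range_num_masked(unsigned long pages, int scale)
{
	return (int)((pages >> (5 * scale + 1)) & RANGE_NUM_MASK) - 1;
}

/*
 * Clamping form (hypothetical): cap the page count at the largest
 * range this SCALE can express, then shift and subtract one.
 */
static inline int range_num_clamped(unsigned long pages, int scale)
{
	unsigned long max = range_pages(31, scale);
	unsigned long capped = pages < max ? pages : max;

	return (int)(capped >> (5 * scale + 1)) - 1;
}

/* The largest range (NUM = 31, SCALE = 3) is where the two forms differ. */
static inline void check_max_range(void)
{
	unsigned long max_pages = range_pages(31, 3);	/* 0x200000 pages */

	assert(range_num_masked(max_pages, 3) == -1);
	assert(range_num_clamped(max_pages, 3) == 31);
}
```

In this sketch, a request for range_pages(31, 3) pages gives NUM = -1
with the masking form, so no range operation would be selected, while
the clamping form gives NUM = 31 and covers the whole area in a single
operation. For smaller requests, such as 3 pages at SCALE = 0, both
forms agree.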

v2: https://lists.infradead.org/pipermail/linux-arm-kernel/2024-April/917432.html
v1: https://lists.infradead.org/pipermail/linux-arm-kernel/2024-April/916972.html

Changelog
=========
v3:
  Improve __TLBI_RANGE_NUM() and its comments. Add patches
  to improve __TLBI_VADDR_RANGE() and __flush_tlb_range_nosync() (Marc)
v2:
  Improve __TLBI_RANGE_NUM() (Marc)

Gavin Shan (3):
  arm64: tlb: Fix TLBI RANGE operand
  arm64: tlb: Improve __TLBI_VADDR_RANGE()
  arm64: tlb: Allow range operation for MAX_TLBI_RANGE_PAGES

arch/arm64/include/asm/tlbflush.h | 53 ++++++++++++++++++-------------
1 file changed, 31 insertions(+), 22 deletions(-)

--
2.44.0