Re: [PATCH v3 0/3] arm64: tlb: Fix TLBI RANGE operand

From: Shaoqin Huang
Date: Wed Apr 10 2024 - 04:45:11 EST




On 4/5/24 11:58, Gavin Shan wrote:
A kernel crash on the destination VM after live migration was
reported by Yihuang Yu. The issue is only reproducible on NVIDIA's
Grace Hopper, where the TLBI RANGE feature is available. The kernel
crash is caused by an incomplete TLB flush and missed dirty pages. For
the root cause and analysis, please refer to PATCH[v3 1/3]'s commit log.

Thanks to Marc Zyngier who proposed all the code changes.

PATCH[1] fixes the kernel crash by extending __TLBI_RANGE_NUM() so that
a TLBI RANGE operation on an area of MAX_TLBI_RANGE_PAGES pages is
supported.
PATCH[2] improves __TLBI_VADDR_RANGE() with masks and FIELD_PREP().
PATCH[3] allows a TLBI RANGE operation on an area of MAX_TLBI_RANGE_PAGES
pages in __flush_tlb_range_nosync().
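
For readers following along, here is a rough userspace sketch of the
boundary case that PATCH[1] is about. It is not copied from the series:
the helper names are hypothetical, and the formulas (a range operation
covering (num + 1) << (5 * scale + 1) pages, with num in 0..31 and scale
in 0..3) are my assumption of how the macros in tlbflush.h behave. Please
check the actual patch for the authoritative code.

```c
/*
 * Hypothetical helpers, NOT the kernel macros themselves.
 * Assumption: one range operation covers
 * (num + 1) << (5 * scale + 1) pages, num in [0, 31], scale in [0, 3].
 */
static unsigned long range_pages(unsigned long num, unsigned int scale)
{
	return (num + 1) << (5 * scale + 1);
}

/*
 * Assumed pre-fix behaviour: the quotient is masked to 5 bits before
 * subtracting 1. A quotient of exactly 32 wraps to 0, so the result
 * is -1 and the range would not be flushed at this scale.
 */
static int range_num_masked(unsigned long pages, unsigned int scale)
{
	return (int)((pages >> (5 * scale + 1)) & 0x1f) - 1;
}

/*
 * Assumed post-fix behaviour: clamp the page count to the largest
 * range this scale can express, then shift and subtract 1.
 */
static int range_num_clamped(unsigned long pages, unsigned int scale)
{
	unsigned long max = range_pages(31, scale);
	unsigned long clamped = pages < max ? pages : max;

	return (int)(clamped >> (5 * scale + 1)) - 1;
}
```

Under these assumptions, range_pages(31, 3) is 2097152 pages. For exactly
that many pages at scale 3, the masking variant returns -1 while the
clamping variant returns 31, which is the case the cover letter describes
as previously unsupported.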

v2: https://lists.infradead.org/pipermail/linux-arm-kernel/2024-April/917432.html
v1: https://lists.infradead.org/pipermail/linux-arm-kernel/2024-April/916972.html

Changelog
=========
v3:
Improve __TLBI_RANGE_NUM() and its comments. Add patches
to improve __TLBI_VADDR_RANGE() and __flush_tlb_range_nosync() (Marc)
v2:
Improve __TLBI_RANGE_NUM() (Marc)

Gavin Shan (3):
arm64: tlb: Fix TLBI RANGE operand
arm64: tlb: Improve __TLBI_VADDR_RANGE()
arm64: tlb: Allow range operation for MAX_TLBI_RANGE_PAGES

arch/arm64/include/asm/tlbflush.h | 53 ++++++++++++++++++-------------
1 file changed, 31 insertions(+), 22 deletions(-)


For the series.

Reviewed-by: Shaoqin Huang <shahuang@xxxxxxxxxx>

--
Shaoqin