Re: [PATCH v2] arm64/mm: pmd_mkinvalid() must handle swap pmds

From: Will Deacon
Date: Tue Apr 30 2024 - 09:55:48 EST


On Tue, Apr 30, 2024 at 02:31:38PM +0100, Ryan Roberts wrote:
> __split_huge_pmd_locked() can be called for a present THP, devmap or
> (non-present) migration entry. It calls pmdp_invalidate()
> unconditionally on the pmdp and only determines if it is present or not
> based on the returned old pmd.
>
> But arm64's pmd_mkinvalid(), called by pmdp_invalidate(),
> unconditionally sets the PMD_PRESENT_INVALID flag, which causes future
> pmd_present() calls to return true - even for a swap pmd. Therefore any
> lockless pgtable walker could see the migration entry pmd in this state
> and start interpretting the fields (e.g. pmd_pfn()) as if it were
> present, leading to BadThings (TM). GUP-fast appears to be one such
> lockless pgtable walker.
>
> While the obvious fix is for core-mm to avoid such calls for non-present
> pmds (pmdp_invalidate() will also issue TLBI which is not necessary for
> this case either), all other arches that implement pmd_mkinvalid() do it
> in such a way that it is robust to being called with a non-present pmd.
> So it is simpler and safer to make arm64 robust too. This approach means
> we can even add tests to debug_vm_pgtable.c to validate the required
> behaviour.
>
> This is a theoretical bug found during code review. I don't have any
> test case to trigger it in practice.
>
> Cc: stable@xxxxxxxxxxxxxxx
> Fixes: 53fa117bb33c ("arm64/mm: Enable THP migration")
> Signed-off-by: Ryan Roberts <ryan.roberts@xxxxxxx>
> ---
>
> Hi all,
>
> v1 of this fix [1] took the approach of fixing core-mm to never call
> pmdp_invalidate() on a non-present pmd. But Zi Yan highlighted that only arm64
> suffers this problem; all other arches are robust. So his suggestion was to
> instead make arm64 robust in the same way and add tests to validate it. Despite
> my stated reservations in the context of the v1 discussion, having thought on it
> for a bit, I now agree with Zi Yan. Hence this post.
>
> Andrew has v1 in mm-unstable at the moment, so probably the best thing to do is
> remove it from there and have this go in through the arm64 tree? Assuming there
> is agreement that this approach is right one.
>
> This applies on top of v6.9-rc5. Passes all the mm selftests on arm64.
>
> [1] https://lore.kernel.org/linux-mm/20240425170704.3379492-1-ryan.roberts@xxxxxxx/
>
> Thanks,
> Ryan
>
>
> arch/arm64/include/asm/pgtable.h | 12 +++++--
> mm/debug_vm_pgtable.c | 61 ++++++++++++++++++++++++++++++++
> 2 files changed, 71 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index afdd56d26ad7..7d580271a46d 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -511,8 +511,16 @@ static inline int pmd_trans_huge(pmd_t pmd)
>
> static inline pmd_t pmd_mkinvalid(pmd_t pmd)
> {
> - pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
> - pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
> + /*
> + * If not valid then either we are already present-invalid or we are
> + * not-present (i.e. none or swap entry). We must not convert
> + * not-present to present-invalid. Unbelievably, the core-mm may call
> + * pmd_mkinvalid() for a swap entry and all other arches can handle it.
> + */
> + if (pmd_valid(pmd)) {
> + pmd = set_pmd_bit(pmd, __pgprot(PMD_PRESENT_INVALID));
> + pmd = clear_pmd_bit(pmd, __pgprot(PMD_SECT_VALID));
> + }
>
> return pmd;
> }

Acked-by: Will Deacon <will@xxxxxxxxxx>

But it might be worth splitting the tests from the fix to make backporting
easier.

Catalin -- I assume you'll pick this up, but please shout if you want me
to take it instead.

Will