Re: [PATCH v2 2/5] mm: avoid unnecessary flush on change_huge_pmd()

From: Peter Zijlstra
Date: Mon Oct 25 2021 - 06:52:41 EST


On Thu, Oct 21, 2021 at 05:21:09AM -0700, Nadav Amit wrote:
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 448cd01eb3ec..18c3366f8f4d 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -1146,6 +1146,14 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
> }
> }
> #endif
> +
> +#define __HAVE_ARCH_PMDP_INVALIDATE_AD
> +static inline pmd_t pmdp_invalidate_ad(struct vm_area_struct *vma,
> + unsigned long address, pmd_t *pmdp)
> +{
> + return pmdp_establish(vma, address, pmdp, pmd_mkinvalid(*pmdp));

Did this want to be something like:

pmd_t old = pmdp_establish(vma, address, pmdp, pmd_mkinvalid(*pmdp));
if (cpu_feature_enabled(X86_BUG_PTE_LEAK))
flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
return old;

instead?

> +}
> +
> /*
> * Page table pages are page-aligned. The lower half of the top
> * level is used for userspace and the top half for the kernel.

> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e5ea5f775d5c..435da011b1a2 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1795,10 +1795,11 @@ int change_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
> * The race makes MADV_DONTNEED miss the huge pmd and don't clear it
> * which may break userspace.
> *
> - * pmdp_invalidate() is required to make sure we don't miss
> - * dirty/young flags set by hardware.
> + * pmdp_invalidate_ad() is required to make sure we don't miss
> + * dirty/young flags (which are also known as access/dirty) cannot be
> + * further modifeid by the hardware.

"modified", I think is the more common spelling.

> */
> - entry = pmdp_invalidate(vma, addr, pmd);
> + entry = pmdp_invalidate_ad(vma, addr, pmd);
>
> entry = pmd_modify(entry, newprot);
> if (preserve_write)
> diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
> index 4e640baf9794..b0ce6c7391bf 100644
> --- a/mm/pgtable-generic.c
> +++ b/mm/pgtable-generic.c
> @@ -200,6 +200,14 @@ pmd_t pmdp_invalidate(struct vm_area_struct *vma, unsigned long address,
> }
> #endif
>
> +#ifndef __HAVE_ARCH_PMDP_INVALIDATE_AD

/*
* Does this deserve a comment to explain the intended difference vs
* pmdp_invalidate() ?
*/

> +pmd_t pmdp_invalidate_ad(struct vm_area_struct *vma, unsigned long address,
> + pmd_t *pmdp)
> +{
> + return pmdp_invalidate(vma, address, pmdp);
> +}
> +#endif
> +
> #ifndef pmdp_collapse_flush
> pmd_t pmdp_collapse_flush(struct vm_area_struct *vma, unsigned long address,
> pmd_t *pmdp)
> --
> 2.25.1
>