Re: [PATCH RFC v2 11/18] mm: skip zeroing in vma_alloc_zeroed_movable_folio for pre-zeroed pages

From: David Hildenbrand (Arm)

Date: Tue Apr 21 2026 - 07:26:12 EST


On 4/21/26 13:12, David Hildenbrand (Arm) wrote:
> On 4/21/26 12:58, David Hildenbrand (Arm) wrote:
>> On 4/20/26 14:50, Michael S. Tsirkin wrote:
>>> Use vma_alloc_folio_hints() and check PGHINT_ZEROED to skip
>>> clear_user_highpage() when the page is already zeroed.
>>>
>>> On x86, vma_alloc_zeroed_movable_folio is overridden by a macro
>>> that uses __GFP_ZERO directly, so this change has no effect there.
>>>
>>> Signed-off-by: Michael S. Tsirkin <mst@xxxxxxxxxx>
>>> Assisted-by: Claude:claude-opus-4-6
>>> Assisted-by: cursor-agent:GPT-5.4-xhigh
>>> ---
>>> include/linux/highmem.h | 6 ++++--
>>> 1 file changed, 4 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>>> index af03db851a1d..8bb67772c1cb 100644
>>> --- a/include/linux/highmem.h
>>> +++ b/include/linux/highmem.h
>>> @@ -321,9 +321,11 @@ struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>>> unsigned long vaddr)
>>> {
>>> struct folio *folio;
>>> + pghint_t hints;
>>>
>>> - folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vaddr);
>>> - if (folio && user_alloc_needs_zeroing())
>>> + folio = vma_alloc_folio_hints(GFP_HIGHUSER_MOVABLE, 0, vma, vaddr,
>>> + &hints);
>>> + if (folio && user_alloc_needs_zeroing() && !(hints & PGHINT_ZEROED))
>>> clear_user_highpage(&folio->page, vaddr);
>>>
>>> return folio;
>>
>>
>> For others reading along, the variant on your other branch:
>>
>> diff --git a/include/linux/highmem.h b/include/linux/highmem.h
>> index af03db851a1d9..ffa683f64f1d1 100644
>> --- a/include/linux/highmem.h
>> +++ b/include/linux/highmem.h
>> @@ -320,13 +320,8 @@ static inline
>> struct folio *vma_alloc_zeroed_movable_folio(struct vm_area_struct *vma,
>> unsigned long vaddr)
>> {
>> - struct folio *folio;
>> -
>> - folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, vma, vaddr);
>> - if (folio && user_alloc_needs_zeroing())
>> - clear_user_highpage(&folio->page, vaddr);
>> -
>> - return folio;
>> + return vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO,
>> + 0, vma, vaddr);
>> }
>> #endif
>>
>> Looks like an extremely clean interface.
>>
>
> And we could likely just do the following on top, I assume.
>
> diff --git a/arch/m68k/include/asm/page_no.h b/arch/m68k/include/asm/page_no.h
> index d2532bc407ef..f511b763a235 100644
> --- a/arch/m68k/include/asm/page_no.h
> +++ b/arch/m68k/include/asm/page_no.h
> @@ -12,9 +12,6 @@ extern unsigned long memory_end;
>
> #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
>
> -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \
> - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr)
> -
> #define __pa(vaddr) ((unsigned long)(vaddr))
> #define __va(paddr) ((void *)((unsigned long)(paddr)))
>
> diff --git a/arch/s390/include/asm/page.h b/arch/s390/include/asm/page.h
> index 56da819a79e6..e995d2a413f9 100644
> --- a/arch/s390/include/asm/page.h
> +++ b/arch/s390/include/asm/page.h
> @@ -67,9 +67,6 @@ static inline void copy_page(void *to, void *from)
>
> #define copy_user_page(to, from, vaddr, pg) copy_page(to, from)
>
> -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \
> - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr)
> -
> #ifdef CONFIG_STRICT_MM_TYPECHECKS
> #define STRICT_MM_TYPECHECKS
> #endif
> diff --git a/arch/x86/include/asm/page.h b/arch/x86/include/asm/page.h
> index 416dc88e35c1..92fa975b46f3 100644
> --- a/arch/x86/include/asm/page.h
> +++ b/arch/x86/include/asm/page.h
> @@ -28,9 +28,6 @@ static inline void copy_user_page(void *to, void *from, unsigned long vaddr,
> copy_page(to, from);
> }
>
> -#define vma_alloc_zeroed_movable_folio(vma, vaddr) \
> - vma_alloc_folio(GFP_HIGHUSER_MOVABLE | __GFP_ZERO, 0, vma, vaddr)
> -
> #ifndef __pa
> #define __pa(x) __phys_addr((unsigned long)(x))
> #endif
>
>

... and I wonder whether we could then convert the arm64 variant into a
simple helper that just returns additional gfp flags.

We might even be able to call that from inside vma_alloc_folio(), to
just get __GFP_ZEROTAGS whenever the VMA has VM_MTE.

But that's obviously some additional work on top.
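
To make the "helper that just returns additional gfp flags" idea a bit more concrete, here is a rough, untested sketch written as a standalone userspace model rather than kernel code. Everything prefixed with model_/MODEL_ is made up for illustration: the flag values are placeholders and not the kernel's real bit values, and model_arch_zeroed_folio_gfp() is a hypothetical name for the arch hook. It assumes the arm64 variant only needs to add __GFP_ZEROTAGS for VM_MTE VMAs.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder bit values for illustration only; NOT the kernel's values. */
typedef unsigned int model_gfp;
#define MODEL_GFP_HIGHUSER_MOVABLE	0x1u
#define MODEL_GFP_ZERO			0x2u
#define MODEL_GFP_ZEROTAGS		0x4u
#define MODEL_VM_MTE			0x8ul

/* Minimal stand-in for struct vm_area_struct: only the VMA flags matter here. */
struct model_vma {
	unsigned long vm_flags;
};

/*
 * Hypothetical arch hook: return only the *additional* gfp flags the
 * architecture wants for a zeroed user folio in this VMA. In this model,
 * an MTE-enabled VMA asks for the tags to be zeroed as well.
 */
static model_gfp model_arch_zeroed_folio_gfp(const struct model_vma *vma)
{
	assert(vma != NULL);
	return (vma->vm_flags & MODEL_VM_MTE) ? MODEL_GFP_ZEROTAGS : 0;
}

/*
 * Generic side: compute the flags that a common
 * vma_alloc_zeroed_movable_folio() would pass to the allocator, with the
 * arch contributing nothing but its extra flags.
 */
static model_gfp model_zeroed_movable_gfp(const struct model_vma *vma)
{
	return MODEL_GFP_HIGHUSER_MOVABLE | MODEL_GFP_ZERO |
	       model_arch_zeroed_folio_gfp(vma);
}
```

With something along these lines, the generic function would stay the only definition and architectures without special needs would simply have the hook return 0.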

--
Cheers,

David