Re: [PATCH RFC v3 01/19] mm: thread user_addr through page allocator for cache-friendly zeroing
From: David Hildenbrand (Arm)
Date: Thu Apr 23 2026 - 10:14:19 EST

On 4/23/26 15:42, Gregory Price wrote:
> On Thu, Apr 23, 2026 at 07:57:01AM -0400, Michael S. Tsirkin wrote:
>> On Thu, Apr 23, 2026 at 11:46:56AM +0200, David Hildenbrand (Arm) wrote:
>>
>> Changes we have to add: 8 changes
>> Rename 4 existing APIs adding _user: __alloc_pages,
>> __folio_alloc, folio_alloc_mpol, __alloc_frozen_pages + add
>> 4 wrapper macros/inlines in gfp.h that forward to the _user variants with
>> USER_ADDR_NONE. Roughly 6-8 lines of boilerplate per API.
>>
>
> This is essentially what i was proposing.
>
> The result would be more external surface to the buddy (at least 2 maybe
> more functions, plus all the _noprof stuff and a bunch of other things).

We probably wouldn't need the prof stuff if all we care about is
vma_alloc_folio(), which will only be calling the _noprof variants internally.

>
> And then all the callers have to be updated everywhere, and it would not
> make it any harder to mess up. You either have
>
> old_interface(..., user_addr);
> /* Churn everything using the old interface */
>
> or
>
> __internal(..., user_addr) { }
>
> old_interface(...) {
> __internal(..., NO_USER_ADDR);
> }
>
> new_interface(..., user_addr) {
> __internal(..., user_addr);
> }
>
> /*
> * Update some callsites to use new_interface()
> * mostly a bunch of _mpol() functions but some others
> */
>
> but someone could still just as easily do
>
> old_interface();
> /* don't call folio_zero_user() - Bug! */
>
> So it's not like this makes things any harder to mess up.
Right.

Looking at the patch, there would already be less churn if

	folio_alloc_mpol()
	__alloc_frozen_pages_noprof()
	__alloc_pages()
	post_alloc_hook()

had simple wrappers.
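
To illustrate the wrapper shape, here is a minimal user-space sketch. It is
not kernel code: every mock_* name is hypothetical, the "zeroing with a hint"
flag only stands in for whatever the real allocator would do with the address,
and only USER_ADDR_NONE is borrowed from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Sentinel meaning "no user address known" (name taken from the thread). */
#define USER_ADDR_NONE ((unsigned long)-1)

/* Hypothetical stand-in for a page; not a kernel structure. */
struct mock_page {
	unsigned long user_addr;   /* hint the allocator was given */
	bool zeroed_with_hint;     /* true if zeroing could use the hint */
	unsigned char data[64];
};

/* Single internal entry point: the only function that takes user_addr. */
static struct mock_page *__mock_alloc_page_user(bool zero,
						unsigned long user_addr)
{
	struct mock_page *page = malloc(sizeof(*page));

	if (!page)
		return NULL;
	page->user_addr = user_addr;
	page->zeroed_with_hint = false;
	if (zero) {
		memset(page->data, 0, sizeof(page->data));
		page->zeroed_with_hint = (user_addr != USER_ADDR_NONE);
	}
	return page;
}

/* Old interface: signature unchanged, so existing callers see no churn. */
static inline struct mock_page *mock_alloc_page(bool zero)
{
	return __mock_alloc_page_user(zero, USER_ADDR_NONE);
}

/* New interface: only callers that know the user address switch to it. */
static inline struct mock_page *mock_alloc_page_user(bool zero,
						     unsigned long user_addr)
{
	return __mock_alloc_page_user(zero, user_addr);
}

/* Small usage example exercising both entry points. */
static int mock_demo(void)
{
	struct mock_page *kernel_page = mock_alloc_page(true);
	struct mock_page *user_page = mock_alloc_page_user(true, 0x7f001000UL);

	assert(kernel_page && user_page);
	assert(kernel_page->user_addr == USER_ADDR_NONE);
	assert(!kernel_page->zeroed_with_hint);
	assert(user_page->zeroed_with_hint);

	free(kernel_page);
	free(user_page);
	return 0;
}
```

As in the quoted pseudocode, nothing here stops a caller from using the old
entry point where it should have passed an address. The wrappers only limit
the churn.
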
hugetlb calls __alloc_frozen_pages(). That's rather nasty, given that
it wants to allocate a frozen folio.
alloc_hugetlb_folio() already consumes vma+addr.
alloc_buddy_hugetlb_folio_with_mpol() already consumes vma+addr.
Maybe we could forward the vma+addr here and call a vma_alloc_frozen_folio()
if we have a VMA+addr, to get a clean interface.
But really, that hugetlb code is rather messy. I'd vote for leaving hugetlb
alone on a v1, and focusing on non-hugetlb first.
--
Cheers,
David