Re: New design

From: David Hildenbrand (Arm)

Date: Tue Jun 09 2026 - 04:28:29 EST


On 6/9/26 05:58, Matthew Wilcox wrote:
> OK, here's how I'd structure this:
>
> 1. Introduce PG_zeroed for buddy pages
> 2. Set it if init_on_free is set
> 3. Set it from balloon driver
>
> https://lore.kernel.org/lkml/c7094de807c0e963526686e1d245bc76193b1a92.1776689093.git.mst@xxxxxxxxxx/
>
> but add FPI_ZEROED instead of an extra bool parameter.
>
> 4. Introduce page_is_zeroed like this:
>
> static inline bool page_is_zeroed(const struct page *page)
> {
> 	/*
> 	 * lru.next has bit 2 set if the page is already zeroed.
> 	 * Callers may simply overwrite it once they no longer
> 	 * need to preserve that information.
> 	 */
> 	return (unsigned long)page->lru.next & BIT(2);
> }
>
> (you'll notice this is similar to page_is_pfmemalloc() but it doesn't
> need to be in mm.h)
>
> This step is going to be a bit fiddly. We weren't expecting to return
> multiple flags in page->lru.next, so clear_page_pfmemalloc() just sets
> page->lru.next to NULL. So somewhere we need to make sure that
> page->lru.next is definitely NULL, and then allow both the zeroed and
> pfmemalloc flags to be set in it.
>
> The important part of this is that it allows the zeroed flag to be
> returned from the page allocator without introducing pghint_t like you
> did in v2.

I previously raised (in v2? not sure) that we could use a page flag that is
only used for folios, and then simply clear that flag on the folio allocation
path so that we don't get false positives with the bit set.
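
To make that concrete, here is a minimal userspace sketch of the idea, not
kernel code. All names (struct model_page, MODEL_PG_ZEROED, model_free_page,
model_alloc_zeroed_folio) are made up for illustration. It assumes a flag bit
that only has meaning for folios once allocated, so the free lists can borrow
it as a "contents are zero" hint, and the folio allocation path consumes and
clears it:

```c
#include <stdbool.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64

/*
 * Hypothetical flag bit: assumed to be meaningful only for allocated
 * folios, so a free page can reuse it to mean "contents are zero".
 */
#define MODEL_PG_ZEROED (1UL << 0)

struct model_page {
	unsigned long flags;
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Free path: record the hint only if the caller guarantees zeroed contents. */
static void model_free_page(struct model_page *page, bool zeroed)
{
	if (zeroed)
		page->flags |= MODEL_PG_ZEROED;
	else
		page->flags &= ~MODEL_PG_ZEROED;
}

/*
 * Folio allocation path: consume the hint, always clear the bit so the
 * folio never sees a stale flag, and zero the page only when needed.
 * Returns true if the zeroing step was skipped.
 */
static bool model_alloc_zeroed_folio(struct model_page *page)
{
	bool was_zeroed = page->flags & MODEL_PG_ZEROED;

	page->flags &= ~MODEL_PG_ZEROED;
	if (!was_zeroed)
		memset(page->data, 0, sizeof(page->data));
	return was_zeroed;
}
```

The point of clearing unconditionally in the allocation path is that the bit
can never leak into the folio's own use of that flag.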

>
> 5. Now you can start skipping various zeroing steps higher in the call
> chain.
>
> I understand David's disgust with vma_alloc_zeroed_movable_folio()
> but that is surely a separate cleanup and nothing to do with this
> patchset.

Well, in my reality, we're just finding interesting ways to work around the fact
that GFP_ZERO sometimes does what we want, sometimes doesn't.

So we leak information out of the buddy to really only handle one scenario:
fixing up GFP_ZERO currently sometimes not doing what we want.

I'm afraid we couldn't use the above trick to punch zeroed pages back into the
buddy: some random user doing alloc+use+free would be unaware that there is a
bit to clear.
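
A toy userspace model of that failure mode, under the assumption that the
hint stays in the page after allocation and the free path trusts whatever it
finds there. Every name here (struct hint_page, HINT_ZEROED, hint_*) is
hypothetical:

```c
#include <stdbool.h>
#include <stddef.h>

#define HINT_PAGE_SIZE 64
#define HINT_ZEROED (1UL << 2)	/* mirrors "bit 2" from the proposal */

struct hint_page {
	unsigned long hint;	/* stands in for the bits in lru.next */
	unsigned char data[HINT_PAGE_SIZE];
};

/* An aware producer (e.g. a balloon driver) frees a page it knows is zero. */
static void hint_free_zeroed(struct hint_page *page)
{
	page->hint |= HINT_ZEROED;
}

/* What the allocator would believe about the page. */
static bool hint_looks_zeroed(const struct hint_page *page)
{
	return page->hint & HINT_ZEROED;
}

/* Ground truth: scan the contents. */
static bool hint_really_zeroed(const struct hint_page *page)
{
	for (size_t i = 0; i < HINT_PAGE_SIZE; i++)
		if (page->data[i])
			return false;
	return true;
}

/*
 * A random user doing alloc+use+free. It dirties the page and hands it
 * back without touching the hint, because it does not know the bit exists.
 */
static void hint_unaware_user(struct hint_page *page)
{
	page->data[0] = 0xff;
	/* free: nothing here clears page->hint */
}
```

After hint_free_zeroed() followed by hint_unaware_user(), hint_looks_zeroed()
still reports true while hint_really_zeroed() reports false, which is the
stale-bit case described above.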

So I assume really only folio allocation would make use of this, to work around
our problematic GFP_ZERO implementation.

--
Cheers,

David