Re: [PATCH v6 0/3] mm: Free contiguous order-0 pages efficiently

From: David Hildenbrand (Arm)

Date: Wed Apr 29 2026 - 09:07:41 EST

On 4/29/26 14:31, Ryan Roberts wrote:
> On 29/04/2026 13:04, Andrew Morton wrote:
>> On Wed, 29 Apr 2026 06:33:26 -0400 Johannes Weiner <hannes@xxxxxxxxxxx> wrote:
>>
>>>
>>> I think we should revert the original patch.
>>>
>>> The premise is that we can save some allocator calls by requesting
>>> higher orders and splitting them up into singles. This is a frivolous
>>> and short-sighted use of a very coveted and expensive resource.
>
> I'm not sure it's that simple. First off, vmalloc has preferred to allocate high
> order pages for quite a while, it's just that the patch you're referring to
> makes it try even harder. So reverting the patch doesn't completely revert the
> behaviour, it just reduces it.
>
> Performance benefits because those high order pages are mapped appropriately in
> the page table - i.e. 1G PUD, 2M PMD, (or 64K CONTPTE on arm64). So it's not
> solely about the number of cycles spent in the allocator; the HW is used more
> efficiently. vmalloc only splits to order-0 for the benefit of the caller,
> because there are some places that assume they can access each returned struct page.
>
> And all the order-0 pages of the original high order page are freed at the same
> time, so it's not like we are destroying the contiguous resource; it remains
> intact for the next user (well, ignoring that some will be freed to the pcpu
> list - this series solves that wrinkle). I've heard it argued that this approach
> is actually _better_ for conserving contiguous blocks because it's keeping the
> lifetime of all the constituent pages bound together and reducing fragmentation.
> I've never seen any data though...

Right, that's what Willy has said: allocating+freeing larger blocks, especially
for unmovable data, reduces fragmentation as a whole. And that theory makes
sense to me in the context here.

I don't think we want to revert the original patch.

--
Cheers,

David