Re: [PATCH v2 0/8] Introduce a huge-page pre-zeroing mechanism
From: David Hildenbrand (Red Hat)
Date: Wed Jan 21 2026 - 07:51:22 EST
On 1/20/26 19:18, Gregory Price wrote:
On Tue, Jan 20, 2026 at 06:39:48PM +0800, Li Zhe wrote:
On Tue, 20 Jan 2026 09:47:44 +0000, david.laight.linux@xxxxxxxxx wrote:
On Tue, 20 Jan 2026 14:27:06 +0800
"Li Zhe" <lizhe.67@xxxxxxxxxxxxx> wrote:
Am I missing something?
If userspace does:
$ program_a; program_b
and pages used by program_a are zeroed when it exits you get the delay
for zeroing all the pages it used before program_b starts.
OTOH if the zeroing is deferred program_b only needs to zero the pages
it needs to start (and there may be some lurking).
Under the init_on_free approach, improving the speed of zeroing may
indeed prove necessary.
However, I believe we should first reach consensus on adopting
“init_on_free” as the solution to slow application startup before
turning to performance tuning.
His point was that init_on_free may not actually reduce delays for serial
applications, and can in fact introduce additional ones.
Example
-------
program_a: alloc_hugepages(10);
exit();
program_b: alloc_hugepages(5);
exit();
/* Run programs in serial */
sh: program_a && program_b
in zero_on_alloc():
program_a eats zero(10) cost on startup
program_b eats zero(5) cost on startup
Overall zero(15) cost to start program_b
in zero_on_free():
program_a eats zero(10) cost on startup
program_a eats zero(10) cost on exit
program_b eats zero(0) cost on startup
Overall zero(20) cost to start program_b
zero_on_free is worse by zero(5)
-------
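The accounting above can be made explicit with a small cost model. This is a hypothetical Python sketch, not kernel code; "cost" counts pages zeroed on the critical path before the last program in the series starts, which is how the example tallies zero(15) vs zero(20):

```python
def cost_to_start_last_alloc(allocs):
    """zero_on_alloc: every program zeroes all of its pages at startup,
    so the cost to reach the last program's start is the sum of all
    allocations (including the last program's own startup zeroing)."""
    return sum(allocs)

def cost_to_start_last_free(allocs):
    """zero_on_free: pages are zeroed when freed, building a pool of
    pre-zeroed pages; a program's startup only zeroes the shortfall
    the pool doesn't cover.  The last program's exit cost is excluded,
    since we only measure cost incurred before it starts."""
    total = 0
    zeroed_pool = 0                        # free pages already zeroed
    for n in allocs[:-1]:
        total += max(0, n - zeroed_pool)   # startup: zero the shortfall
        zeroed_pool = max(0, zeroed_pool - n)
        total += n                         # exit: zero everything freed
        zeroed_pool += n
    total += max(0, allocs[-1] - zeroed_pool)  # last program's startup
    return total

# The serial run from the example: program_a uses 10, program_b uses 5.
print(cost_to_start_last_alloc([10, 5]))   # 15
print(cost_to_start_last_free([10, 5]))    # 20
```

The model reproduces the example's totals and makes the caveat visible: zero_on_free only pays off if the exit-time zeroing can be moved off the critical path (e.g. done asynchronously), which is exactly the point under debate.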
This is a trivial example, but it's unclear that zero_on_free actually
provides a benefit. You have to know the runtime behavior, the pre-zeroed
page count, and the allocation pattern (0->10->5->...) ahead of time to
determine whether there's an actual reduction in startup time.
For VMs with hugetlb people usually have some spare pages lying around. VM startup time is more important for cloud providers than VM shutdown time.
I'm sure there are examples where it is the other way around, but having mixed workloads on the system is likely not the highest priority right now.
But just trivially, starting from the base case of no pages being
zeroed, you're just injecting an additional zero(X) cost if program_a()
consumes more hugepages than program_b().
And whatever you do,
program_a()
program_b()
will have to zero the pages.
No asynchronous mechanism will really help.
Long way of saying the shift from alloc to free seems heuristic-y and
you need stronger analysis / better data to show this change is actually
beneficial in the general case.
I think the principle of "the allocator already contains zeroed pages" is quite universal and simple.
Whether you actually zero the pages when the last reference is gone (like we do in the buddy), or have that happen from some asynchronous context, is rather an internal optimization.
--
Cheers
David