Re: [PATCH 0/8] Introduce a huge-page pre-zeroing mechanism
From: Li Zhe
Date: Mon Dec 29 2025 - 07:32:35 EST
On Sat, 27 Dec 2025 08:21:16 +0100, mjguzik@xxxxxxxxx wrote:
> In the name of "provide tools, not policy" making userspace call the
> shots is the right approach, which I advocated for in the original
> thread.

Thank you for your endorsement!

> I do have concerns about the specific interface as I think it is a
> little too limited.
>
> Suppose vastly different deployments with different needs. For example
> one may want to keep at least n pages ready to use, RAM permitting.
>
> At the same time it perhaps would like to balance CPU usage vs other
> tasks, so for example it would control parallelism based on observed
> churn rate.
>
> So a toolset I would consider viable would need to provide an extensible
> interface to future-proof it.
>
> As for an immediate need not met with the current patchset, there is no
> configurable threshold for free zeroed page count to generate a wake up.
>
> I suspect a bunch of ioctls would be needed here.
>
> I don't know if sysfs is viable at all for this. Worst case a device (or
> a set of per-node devices) can be created with the same goal.

In my view, sysfs does not allow an ioctl interface to be placed under
the per-node huge-page directories, since sysfs attributes only
support read and write.

The functionality you describe appears to align closely with what the
cgroup v1 memory controller offers through cgroup.event_control:
userspace registers an eventfd together with a threshold and is woken
when that threshold is crossed.

We could therefore introduce a new event_control file for huge-page
events, following the same pattern. Given that all huge-page
attributes already live in sysfs, such an addition would keep the
interface consistent and avoid the extra indirection of a new
/dev/hugepagectl file.
Thanks,
Zhe