Re: [PATCH v3 0/4] mm/folio_zero_user: add multi-page clearing

From: Zi Yan
Date: Tue Apr 15 2025 - 15:10:26 EST


On 13 Apr 2025, at 23:46, Ankur Arora wrote:

> This series adds multi-page clearing for hugepages. It is a rework
> of [1] which took a detour through PREEMPT_LAZY [2].
>
> Why multi-page clearing?: multi-page clearing improves upon the
> current page-at-a-time approach by providing the processor with a
> hint as to the real region size. A processor could use this hint to,
> for instance, elide cacheline allocation when clearing a large
> region.
>
> This optimization in particular is done by REP; STOS on AMD Zen
> where regions larger than L3-size use non-temporal stores.
>
> This results in significantly better performance.

Do you have init_on_alloc=1 in your kernel?
With that, pages coming from the buddy allocator are zeroed
in post_alloc_hook() by kernel_init_pages(), which is a for loop
over clear_highpage_kasan_tagged(), a wrapper around clear_page().
In that case folio_zero_user() is not used.
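
To make the difference concrete, here is a rough userspace model
(not kernel code). All model_* names and MODEL_PAGE_SIZE are made up
for illustration, and memset() only stands in for clear_page(). The
per-page loop never tells the clearing primitive how large the whole
region is, while the single call does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size, illustration only */

/* Stand-in for clear_page(): zeroes exactly one page. */
static void model_clear_page(void *page)
{
	memset(page, 0, MODEL_PAGE_SIZE);
}

/* Page-at-a-time, shaped like the kernel_init_pages() loop. */
static void model_init_pages(void *base, size_t numpages)
{
	assert(base != NULL);
	for (size_t i = 0; i < numpages; i++)
		model_clear_page((char *)base + i * MODEL_PAGE_SIZE);
}

/* Multi-page: one call that sees the full extent of the region. */
static void model_init_pages_multi(void *base, size_t numpages)
{
	assert(base != NULL);
	memset(base, 0, numpages * MODEL_PAGE_SIZE);
}

/* Returns 1 if both strategies leave identical, fully zeroed memory. */
int model_demo(void)
{
	size_t numpages = 8;
	size_t len = numpages * MODEL_PAGE_SIZE;
	unsigned char *a = malloc(len);
	unsigned char *b = malloc(len);
	int same = 0;

	if (a && b) {
		/* Dirty both buffers so the clearing is observable. */
		memset(a, 0xaa, len);
		memset(b, 0xaa, len);
		model_init_pages(a, numpages);
		model_init_pages_multi(b, numpages);
		same = memcmp(a, b, len) == 0 &&
		       a[0] == 0 && a[len - 1] == 0;
	}
	free(a);
	free(b);
	return same;
}
```

Both variants produce the same result; the only difference is how
much of the region each clearing call gets to see.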

At least Debian, Fedora, and Ubuntu have
CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y by default, which means init_on_alloc=1.

Maybe kernel_init_pages() should get your optimization as well,
unless you only target hugetlb pages.

Best Regards,
Yan, Zi