Re: [PATCH v2 6/9] x86/clear_huge_page: multi-page clearing

From: Raghavendra K T
Date: Wed Sep 13 2023 - 02:44:17 EST


On 8/31/2023 12:19 AM, Ankur Arora wrote:
clear_pages_rep(), clear_pages_erms() clear using string instructions.
While clearing extents of more than a single page, we can use these
more effectively by explicitly advertising the region-size to the
processor.

This can be used as a hint by the processor-uarch to optimize the
clearing (ex. to avoid polluting one or more levels of the data-cache.)

As a secondary benefit, string instructions are typically microcoded,
and so it's a good idea to amortize the cost of the decode across larger
regions.
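
To illustrate the point about advertising the region size, here is a minimal user-space sketch (not the kernel's clear_pages_rep() or clear_pages_erms()) that issues one string instruction for a whole multi-page extent instead of one per page. The names clear_extent_sketch() and SKETCH_PAGE_SIZE are hypothetical, and the memset() fallback is only there so the sketch builds on non-x86 targets.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumed base page size */

/*
 * Zero npages contiguous pages starting at addr.
 * On x86-64 a single "rep stosb" covers the whole extent, so the
 * processor sees the full length in RCX up front.
 */
static void clear_extent_sketch(void *addr, size_t npages)
{
    size_t len = npages * SKETCH_PAGE_SIZE;

    assert(addr != NULL);
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    /* RDI = destination, RCX = byte count, AL = fill value (0). */
    __asm__ __volatile__("rep stosb"
                         : "+D"(addr), "+c"(len)
                         : "a"(0)
                         : "memory");
#else
    /* Portable fallback, assumed equivalent for this illustration. */
    memset(addr, 0, len);
#endif
}
```

A per-page loop would instead call this with npages == 1 for each page, paying the microcode startup cost every time and never telling the processor how large the full region is.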

Accordingly, clear_huge_page() now does huge-page clearing in three
parts: the neighbourhood of the faulting address, the left, and the
right region of the neighbourhood.

The local neighbourhood is cleared last to keep its cachelines hot.
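
The ordering described above can be sketched in user space as follows. This is an illustration under stated assumptions, not the code from the patch: the neighbourhood width (SKETCH_NBHD_PAGES), the left-before-right order, and all function names are hypothetical, and memset() stands in for the string-instruction primitives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE  4096UL /* assumed base page size */
#define SKETCH_NBHD_PAGES 8UL    /* hypothetical neighbourhood width */

/* Zero npages pages starting at page index first; no-op for npages == 0. */
static void clear_extent(unsigned char *base, size_t first, size_t npages)
{
    if (npages)
        memset(base + first * SKETCH_PAGE_SIZE, 0,
               npages * SKETCH_PAGE_SIZE);
}

/*
 * Clear a huge page of total_pages base pages in three parts:
 * left region, right region, then the neighbourhood of fault_page,
 * which goes last so its cachelines are the most recently touched.
 */
static void clear_huge_sketch(unsigned char *base, size_t total_pages,
                              size_t fault_page)
{
    size_t half = SKETCH_NBHD_PAGES / 2;
    size_t lo, hi;

    assert(fault_page < total_pages);

    /* Neighbourhood is [lo, hi), clipped to the huge page. */
    lo = fault_page > half ? fault_page - half : 0;
    hi = lo + SKETCH_NBHD_PAGES;
    if (hi > total_pages)
        hi = total_pages;

    clear_extent(base, 0, lo);                /* left region */
    clear_extent(base, hi, total_pages - hi); /* right region */
    clear_extent(base, lo, hi - lo);          /* neighbourhood, last */
}
```

Each of the three calls covers one contiguous extent, so each can be handed to the clearing primitive with its full length.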

[...]

Signed-off-by: Ankur Arora <ankur.a.arora@xxxxxxxxxx>
---
arch/x86/mm/hugetlbpage.c | 54 +++++++++++++++++++++++++++++++++++++++
1 file changed, 54 insertions(+)


Hello Ankur,

Just thinking out loud here (w.r.t. THP).

The V3 patchset with the uarch changes also had changes in the THP
path, where one could explicitly give hints, or non-caching hints, and
these were passed down to call the incoherent clearing.

IMO, those changes logically belong with the uarch optimizations, right?