Re: [PATCH v5 3/4] mm: add docs for per-order mTHP counters and transhuge_page ABI

From: Ryan Roberts
Date: Fri Apr 12 2024 - 06:20:09 EST


On 12/04/2024 08:37, Barry Song wrote:
> From: Barry Song <v-songbaohua@xxxxxxxx>
>
> This patch includes documentation for mTHP counters and an ABI file
> for sys-kernel-mm-transparent-hugepage, which appears to have been
> missing for some time.
>
> Signed-off-by: Barry Song <v-songbaohua@xxxxxxxx>
> Cc: Chris Li <chrisl@xxxxxxxxxx>
> Cc: David Hildenbrand <david@xxxxxxxxxx>
> Cc: Domenico Cerasuolo <cerasuolodomenico@xxxxxxxxx>
> Cc: Kairui Song <kasong@xxxxxxxxxxx>
> Cc: Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>
> Cc: Peter Xu <peterx@xxxxxxxxxx>
> Cc: Ryan Roberts <ryan.roberts@xxxxxxx>
> Cc: Suren Baghdasaryan <surenb@xxxxxxxxxx>
> Cc: Yosry Ahmed <yosryahmed@xxxxxxxxxx>
> Cc: Yu Zhao <yuzhao@xxxxxxxxxx>
> Cc: Jonathan Corbet <corbet@xxxxxxx>

A few nits, but regardless:

Reviewed-by: Ryan Roberts <ryan.roberts@xxxxxxx>

> ---
> .../sys-kernel-mm-transparent-hugepage | 17 +++++++++++
> Documentation/admin-guide/mm/transhuge.rst | 28 +++++++++++++++++++
> 2 files changed, 45 insertions(+)
> create mode 100644 Documentation/ABI/testing/sys-kernel-mm-transparent-hugepage
>
> diff --git a/Documentation/ABI/testing/sys-kernel-mm-transparent-hugepage b/Documentation/ABI/testing/sys-kernel-mm-transparent-hugepage
> new file mode 100644
> index 000000000000..80dde0fd576c
> --- /dev/null
> +++ b/Documentation/ABI/testing/sys-kernel-mm-transparent-hugepage
> @@ -0,0 +1,17 @@
> +What: /sys/kernel/mm/hugepages/

Err, this should be /sys/kernel/mm/transparent_hugepage/, right? Copy/paste error?

> +Date: April 2024
> +Contact: Barry Song <baohua@xxxxxxxxxx>

Looks like a bunch of mm sysfs interfaces use:

Contact: Linux memory management mailing list <linux-mm@xxxxxxxxx>

I'll leave that up to you!

> +Description:
> + /sys/kernel/mm/transparent_hugepage/ contains a number of files and
> + subdirectories,
> + - defrag
> + - enabled
> + - hpage_pmd_size
> + - khugepaged
> + - shmem_enabled
> + - use_zero_page
> + - subdirectories of the form hugepages-<size>kB, where <size>
> + is the page size of the hugepages supported by the kernel/CPU
> + combination.
> +
> + See Documentation/admin-guide/mm/transhuge.rst for details.
> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> index 04eb45a2f940..f436ff982f22 100644
> --- a/Documentation/admin-guide/mm/transhuge.rst
> +++ b/Documentation/admin-guide/mm/transhuge.rst
> @@ -447,6 +447,34 @@ thp_swpout_fallback
> Usually because failed to allocate some continuous swap space
> for the huge page.
>
> +In /sys/kernel/mm/transparent_hugepage/hugepages-<size>kB/stats, There are
> +also individual counters for each huge page size, which can be utilized to
> +monitor the system's effectiveness in providing huge pages for usage. Each
> +counter has its own corresponding file.
> +
> +anon_fault_alloc
> + is incremented every time a huge page is successfully
> + allocated and charged to handle a page fault.
> +
> +anon_fault_fallback
> + is incremented if a page fault fails to allocate or charge
> + a huge page and instead falls back to using huge pages with
> + lower orders or small pages.
> +
> +anon_fault_fallback_charge
> + is incremented if a page fault fails to charge a huge page and
> + instead falls back to using huge pages with lower orders or
> + small pages even though the allocation was successful.
> +
> +anon_swpout
> + is incremented every time a huge page is swapout in one

nit: swapout -> "swapped out"? Although I see this is just a copy/paste of the
description of the existing counter...

> + piece without splitting.
> +
> +anon_swpout_fallback
> + is incremented if a huge page has to be split before swapout.
> + Usually because failed to allocate some continuous swap space
> + for the huge page.
> +
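
For illustration only, here is a minimal userspace sketch of how the per-size
counters described above might be read. It assumes the layout from this patch
(hugepages-<size>kB/stats/<counter>, one integer per file). The helper names
read_mthp_stats() and fallback_ratio() are hypothetical, and the ratio is just
one possible way to summarise anon_fault_alloc against anon_fault_fallback,
not something the patch itself defines.

```python
import os
import re

# Path taken from the documentation text above.
THP_ROOT = "/sys/kernel/mm/transparent_hugepage"
_SIZE_DIR = re.compile(r"^hugepages-(\d+)kB$")


def read_mthp_stats(root=THP_ROOT):
    """Return {size_in_kB: {counter_name: value}} for each hugepages-<size>kB dir.

    Hypothetical helper: assumes every file under stats/ holds one integer.
    """
    result = {}
    if not os.path.isdir(root):
        return result  # THP sysfs not present (or not an mTHP-capable kernel)
    for entry in sorted(os.listdir(root)):
        match = _SIZE_DIR.match(entry)
        if not match:
            continue  # skip defrag, enabled, khugepaged, ...
        stats_dir = os.path.join(root, entry, "stats")
        if not os.path.isdir(stats_dir):
            continue
        counters = {}
        for name in sorted(os.listdir(stats_dir)):
            try:
                with open(os.path.join(stats_dir, name)) as f:
                    counters[name] = int(f.read().strip())
            except (OSError, ValueError):
                continue  # unreadable or non-integer file: ignore it
        result[int(match.group(1))] = counters
    return result


def fallback_ratio(counters):
    """Fraction of anon faults that fell back, or None if there were none."""
    alloc = counters.get("anon_fault_alloc", 0)
    fallback = counters.get("anon_fault_fallback", 0)
    total = alloc + fallback
    return fallback / total if total else None


if __name__ == "__main__":
    for size_kb, counters in sorted(read_mthp_stats().items()):
        print(f"{size_kb}kB: {counters} fallback_ratio={fallback_ratio(counters)}")
```

On a kernel without these files the script simply prints nothing.
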
> As the system ages, allocating huge pages may be expensive as the
> system uses memory compaction to copy data around memory to free a
> huge page for use. There are some counters in ``/proc/vmstat`` to help