Re: [PATCH v9 13/14] mm: multi-gen LRU: admin guide

From: Yu Zhao
Date: Thu Mar 10 2022 - 19:37:55 EST


On Thu, Mar 10, 2022 at 5:30 AM Mike Rapoport <rppt@xxxxxxxxxx> wrote:
>
> Hi,
>
> On Tue, Mar 08, 2022 at 07:12:30PM -0700, Yu Zhao wrote:
> > Add an admin guide.
> >
> > Signed-off-by: Yu Zhao <yuzhao@xxxxxxxxxx>
> > Acked-by: Brian Geffon <bgeffon@xxxxxxxxxx>
> > Acked-by: Jan Alexander Steffens (heftig) <heftig@xxxxxxxxxxxxx>
> > Acked-by: Oleksandr Natalenko <oleksandr@xxxxxxxxxxxxxx>
> > Acked-by: Steven Barrett <steven@xxxxxxxxxxxx>
> > Acked-by: Suleiman Souhlal <suleiman@xxxxxxxxxx>
> > Tested-by: Daniel Byrne <djbyrne@xxxxxxx>
> > Tested-by: Donald Carr <d@xxxxxxxxxxxxxxx>
> > Tested-by: Holger Hoffstätte <holger@xxxxxxxxxxxxxxxxxxxxxx>
> > Tested-by: Konstantin Kharlamov <Hi-Angel@xxxxxxxxx>
> > Tested-by: Shuang Zhai <szhai2@xxxxxxxxxxxxxxxx>
> > Tested-by: Sofia Trinh <sofia.trinh@edi.works>
> > Tested-by: Vaibhav Jain <vaibhav@xxxxxxxxxxxxx>
> > ---
> > Documentation/admin-guide/mm/index.rst | 1 +
> > Documentation/admin-guide/mm/multigen_lru.rst | 146 ++++++++++++++++++
> > mm/Kconfig | 3 +-
> > 3 files changed, 149 insertions(+), 1 deletion(-)
> > create mode 100644 Documentation/admin-guide/mm/multigen_lru.rst
> >
> > diff --git a/Documentation/admin-guide/mm/index.rst b/Documentation/admin-guide/mm/index.rst
> > index c21b5823f126..2cf5bae62036 100644
> > --- a/Documentation/admin-guide/mm/index.rst
> > +++ b/Documentation/admin-guide/mm/index.rst
> > @@ -32,6 +32,7 @@ the Linux memory management.
> > idle_page_tracking
> > ksm
> > memory-hotplug
> > + multigen_lru
> > nommu-mmap
> > numa_memory_policy
> > numaperf
> > diff --git a/Documentation/admin-guide/mm/multigen_lru.rst b/Documentation/admin-guide/mm/multigen_lru.rst
> > new file mode 100644
> > index 000000000000..4ea6a801dc56
> > --- /dev/null
> > +++ b/Documentation/admin-guide/mm/multigen_lru.rst
> > @@ -0,0 +1,146 @@
> > +.. SPDX-License-Identifier: GPL-2.0
> > +
> > +=============
> > +Multi-Gen LRU
> > +=============
>
> I'm still missing an opening paragraph that explains what Multi-gen LRU
> is and why users would want it.
>
> Something like
>
> Multi-gen LRU is an efficient mechanism for page reclamation.
>
> More details are of course welcome :)

I've added the following for the next spin:

+Page reclaim decides the kernel's caching policy and ability to
+overcommit memory. It directly impacts the kswapd CPU usage and RAM
+efficiency. Multi-gen LRU aims to optimize page reclaim and improve
+performance under memory pressure.

> > +Quick start
> > +===========
> > +Build the kernel with the following configurations.
> > +
> > +* ``CONFIG_LRU_GEN=y``
> > +* ``CONFIG_LRU_GEN_ENABLED=y``
> > +
> > +All set!
> > +
> > +Runtime options
> > +===============
> > +``/sys/kernel/mm/lru_gen/`` contains stable ABIs described in the
> > +following subsections.
> > +
> > +Kill switch
> > +-----------
> > +``enable`` accepts different values to enable or disabled the
>
> ^ disable

Good catch. Will fix it up.

> > +following components. The default value of this file depends on
> > +``CONFIG_LRU_GEN_ENABLED``. All the components should be enabled
> > +unless some of them have unforeseen side effects. Writing to
> > +``enable`` has no effect when a component is not supported by the
> > +hardware, and valid values will be accepted even when the main switch
> > +is off.
> > +
> > +====== ===============================================================
> > +Values Components
> > +====== ===============================================================
> > +0x0001 The main switch for the multi-gen LRU.
> > +0x0002 Clearing the accessed bit in leaf page table entries in large
> > +       batches, when MMU sets it (e.g., on x86). This behavior can
> > +       theoretically worsen lock contention (mmap_lock). If it is
> > +       disabled, the multi-gen LRU will suffer a minor performance
> > +       degradation.
> > +0x0004 Clearing the accessed bit in non-leaf page table entries as
> > +       well, when MMU sets it (e.g., on x86). This behavior was not
> > +       verified on x86 varieties other than Intel and AMD. If it is
> > +       disabled, the multi-gen LRU will suffer a negligible
> > +       performance degradation.
> > +[yYnN] Apply to all the components above.
> > +====== ===============================================================
> > +
> > +E.g.,
> > +::
> > +
> > +    echo y >/sys/kernel/mm/lru_gen/enabled
> > +    cat /sys/kernel/mm/lru_gen/enabled
> > +    0x0007
> > +    echo 5 >/sys/kernel/mm/lru_gen/enabled
> > +    cat /sys/kernel/mm/lru_gen/enabled
> > +    0x0005
> > +
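
For readers following along, the value read back from ``enabled`` is a
bitmask over the three components in the table above. A minimal sketch of
decoding it is below; decode_lru_gen and its output labels are
hypothetical and not part of the patch.

```shell
# Hypothetical helper (not part of the patch): print one line per
# component whose bit is set in a value read from
# /sys/kernel/mm/lru_gen/enabled.
decode_lru_gen() {
    val=$(( $1 ))    # accepts decimal or 0x-prefixed hex
    [ $(( val & 0x0001 )) -ne 0 ] && echo "main switch"
    [ $(( val & 0x0002 )) -ne 0 ] && echo "leaf accessed-bit batching"
    [ $(( val & 0x0004 )) -ne 0 ] && echo "non-leaf accessed-bit clearing"
    return 0
}

# 0x0005 is the second value shown in the example above.
decode_lru_gen 0x0005
```

For 0x0005 this should list the main switch and the non-leaf component,
since bit 0x0002 is clear.
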
> > +Thrashing prevention
> > +--------------------
> > +Personal computers are more sensitive to thrashing because it can
> > +cause janks (lags when rendering UI) and negatively impact user
> > +experience. The multi-gen LRU offers thrashing prevention to the
> > +majority of laptop and desktop users who do not have ``oomd``.
> > +
> > +Users can write ``N`` to ``min_ttl_ms`` to prevent the working set of
> > +``N`` milliseconds from getting evicted. The OOM killer is triggered
> > +if this working set cannot be kept in memory. In other words, this
> > +option works as an adjustable pressure relief valve, and when open, it
> > +terminates applications that are hopefully not being used.
> > +
> > +Based on the average human detectable lag (~100ms), ``N=1000`` usually
> > +eliminates intolerable janks due to thrashing. Larger values like
> > +``N=3000`` make janks less noticeable at the risk of premature OOM
> > +kills.
>
> What is the default value of min_ttl_ms?

Right. I've added the following for the next spin:

+The default value ``0`` means disabled.
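
As a rough illustration of how the knob might be set from a script, here
is a minimal sketch. set_min_ttl is a hypothetical helper, and its
optional second argument is an assumption added only so the sketch can be
tried against a scratch file instead of sysfs.

```shell
# Hypothetical helper (not part of the patch): write N milliseconds to
# min_ttl_ms and read the value back. Defaults to the sysfs path from
# the guide; pass another path as $2 to try it on a scratch file.
set_min_ttl() {
    ms=$1
    file=${2:-/sys/kernel/mm/lru_gen/min_ttl_ms}
    case $ms in
        ''|*[!0-9]*)
            echo "min_ttl_ms must be a non-negative integer" >&2
            return 1 ;;
    esac
    [ -w "$file" ] || { echo "$file is not writable" >&2; return 1; }
    echo "$ms" >"$file" && cat "$file"
}

# Typical use, as root:
#   set_min_ttl 1000    # protect the working set of the last second
#   set_min_ttl 0       # back to the default (disabled)
```
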