Re: Reply: [PATCH v2] mm: add lazyfree folio to lru tail

From: Barry Song
Date: Tue Sep 10 2024 - 04:52:17 EST


On Thu, Aug 29, 2024 at 3:55 PM gaoxu <gaoxu2@xxxxxxxxx> wrote:
>
> > On Tue, Aug 27, 2024 at 04:07:57AM +0000, gaoxu wrote:
> > > >
> > > > On Mon, Aug 26, 2024 at 12:55 PM Barry Song <21cnbao@xxxxxxxxx>
> > wrote:
> > > > >
> > > > > On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra <lokeshgidra@xxxxxxxxxx>
> > > > wrote:
> > > > > >
> > > > > > Thanks Suren for looping in
> > > > > >
> > > > > > On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan
> > <surenb@xxxxxxxxxx>
> > > > wrote:
> > > > > > >
> > > > > > > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@xxxxxxxxx>
> > > > wrote:
> > > > > > > >
> > > > > > > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko
> > <mhocko@xxxxxxxx>
> > > > wrote:
> > > > > > > > >
> > > > > > > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > > > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the
> > > > lru_lazyfree_fn:
> > > > > > > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If
> > it's
> > > > > > > > > > moved to the LRU tail, it allows for faster release lazy-free
> > folio
> > > > and
> > > > > > > > > > reduces the impact on file refault.
> > > > > > > > >
> > > > > > > > > This has been discussed when MADV_FREE was introduced. The
> > > > question was
> > > > > > > > > whether this memory has a lower priority than other inactive
> > memory
> > > > that
> > > > > > > > > has been marked that way longer ago. Also consider several
> > > > MADV_FREE
> > > > > > > > > users should they be LIFO from the reclaim POV?
> > > > > >
> > > > > > Thinking from the user's perspective, it seems to me that FIFO within
> > > > > > MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> > > > > > MADV_FREE'ed page hasn't been touched, the chances are higher that it
> > > > > > may not be around anymore.
> > > > > > > >
> > > > >
> > > > > Hi Lokesh,
> > > > > Thanks!
> > > > >
> > > > > > > > The priority of this memory compared to other inactive memory that
> > has
> > > > been
> > > > > > > > marked for a longer time likely depends on the user's expectations -
> > How
> > > > soon
> > > > > > > > do users expect MADV_FREE to be reclaimed compared with old file
> > > > folios.
> > > > > > > >
> > > > > > > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > > > > > > useful performance data and reason in the changelog:
> > > > > > > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > > > > > > >
> > > > > > > > Since art is the Android Java heap, it can be quite large. This increases
> > the
> > > > > > > > likelihood of packing the file LRU and reduces the chances of
> > reclaiming
> > > > > > > > anonymous memory, which could result in more file re-faults while
> > > > helping
> > > > > > > > anonymous folio persist longer in memory.
> > > > > >
> > > > > > Individual heaps of android apps are not big, and even in there we
> > > > > > don't call MADV_FREE on the entire heap.
> > > > >
> > > > > How do you define "Individual heaps of android apps", do you know the
> > usual
> > > > > total_size for a phone with memory pressure by running multiple apps and
> > > > how
> > > > > much for each app?
> > > > >
> > > > Every app is a separate process and therefore has its own private ART
> > > > heap. Those numbers that you are asking vary drastically. But here's
> > > > what I can tell you:
> > > >
> > > > Max heap size for an app is 512MB typically. But it is rarely entirely
> > > > used. Typical heap usage is 50MB to 250MB. But as I said, not all of
> > > > it is MADV_FREE'ed. Only those pages which are freed after GC
> > > > compaction are.
> > > > > > > >
> > > > > > > > I am really curious why art guys have moved to MADV_FREE if we
> > have
> > > > > > > > an approach to reach them.
> > > > > >
> > > > > > Honestly, it makes little sense as a user that calling MADV_FREE on an
> > > > > > anonymous mapping will impact file LRU. That was never the intention
> > > > > > with our ART change.
> > > > > >
> > > > >
> > > > > This is just how MADV_FREE is implemented in the kernel, this kind of
> > lazyfree
> > > > > anon folios are moved to file but *NOT* anon LRU.
> > > > >
> > > > > > From our perspective, once a set of pages are MADV_FREE'ed, they are
> > > > > > like a page-cache. It gives an opportunity, without hurting memory
> > > > > > use, to avoid overhead of page-faults, which happen frequently after
> > > > > > GC is done on running apps.
> > > > > >
> > > > > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > > > > prioritized for reclamation over file ones.
> > > > >
> > > > > This is exactly what this patch is doing, putting lazyfree anon folios
> > > > > to the tail of file LRU so that they can be reclaimed earlier than file
> > > > > folios. But the question is: is the requirement "MADV_FREE'ed pages
> > > > > should be prioritized for reclamation over file ones" universally true for
> > > > > all other non-Android users?
> > > > >
> > > > That's definitely an important question to get answered. But putting
> > > > my users hat on again, by explicitly MADV_FREE'ing we ask for that
> > > > behavior. IMHO, MADV_FREE'ed pages should be the first ones to be
> > > > reclaimed on memory pressure.
> > > For non-Android systems, perhaps the author of MADV_FREE can provide a
> > > more reasonable opinion;
> > >
> > > Add Minchan Kim.
> > > Please forgive me for forgetting to add you when sending the patch.
> >
> > AFAIR, there were two concerns:
> >
> > 1. The file LRU would contain pages used only once.
> >
> > While MADV_FREE allows discarding pages under memory pressure, the system
> > would still have non-working set pages within the file LRU (e.g., those
> > used only once).
> >
> >
> > 2. LRU inversion among MADV_FREE users.
> >
> > Consider this time order:
> >
> > 1. A process: MADV_FREE
> > 2. B process: MADV_FREE
> > 3. C process: MADV_FREE
> >
> > The moving tail approach would discard the most recent pages from Process C
> > first, instead of those from Process A.
> >
> > Of course, this isn't universally true for all workloads, but it's the reality.
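
To make the inversion in the time order above concrete, here is a toy model, not kernel code. It assumes a single list where reclaim always takes from the tail, and it ignores file folios, MGLRU and batching. The names toy_lru, toy_add_head, toy_add_tail, toy_reclaim and toy_reclaim_order are made up for illustration; they only stand in for the head/tail choice between lruvec_add_folio and lruvec_add_folio_tail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LRU_CAP 16

/* Toy LRU: slot[0] is the head, slot[len - 1] is the tail.
 * Reclaim always takes the entry at the tail. */
struct toy_lru {
	char slot[LRU_CAP];
	size_t len;
};

/* Models adding at the LRU head: shift everything toward the tail. */
static void toy_add_head(struct toy_lru *lru, char id)
{
	assert(lru->len < LRU_CAP);
	memmove(&lru->slot[1], &lru->slot[0], lru->len);
	lru->slot[0] = id;
	lru->len++;
}

/* Models adding at the LRU tail: the new entry becomes the next victim. */
static void toy_add_tail(struct toy_lru *lru, char id)
{
	assert(lru->len < LRU_CAP);
	lru->slot[lru->len++] = id;
}

/* Remove and return the entry at the tail. */
static char toy_reclaim(struct toy_lru *lru)
{
	assert(lru->len > 0);
	return lru->slot[--lru->len];
}

/* Processes A, B, C call MADV_FREE in that order; out[] receives the
 * order in which their folios are reclaimed, as a NUL-terminated string. */
static void toy_reclaim_order(int add_to_tail, char out[4])
{
	struct toy_lru lru = { .len = 0 };
	const char procs[] = { 'A', 'B', 'C' };
	size_t i;

	for (i = 0; i < 3; i++) {
		if (add_to_tail)
			toy_add_tail(&lru, procs[i]);
		else
			toy_add_head(&lru, procs[i]);
	}
	for (i = 0; i < 3; i++)
		out[i] = toy_reclaim(&lru);
	out[3] = '\0';
}
```

In this model toy_reclaim_order(0, out) gives "ABC" (head insertion, FIFO among MADV_FREE users) and toy_reclaim_order(1, out) gives "CBA" (tail insertion, LIFO), which is the inversion described above.
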
> With MGLRU enabled, aging and eviction are generation-based, which dilutes the FIFO mechanism: folios added at different times can still fall into the same generation and be reclaimed together.
> Android has always been plagued by performance issues caused by high IO, so many engineers adjust their strategies to prefer reclaiming anon when the system is low on memory. For the same reason,
> we believe lazyfree folios should be reclaimed ahead of file folios. (If I misunderstood, please correct me.)
>
> For other discussions that lean towards reclaiming anon folio, please refer to:
> https://patchwork.kernel.org/project/linux-mm/cover/20231108065818.19932-1-link@xxxxxxxx/
>
> Adding lazyfree folios to the LRU tail has no adverse impact on Android: the system can still reuse MADV_FREE'ed pages normally when it is not low on memory.
> If they are added to the head of the file LRU, Android runs into various issues such as high IO and heavy kswapd load, which would force us to stop Android ART from using MADV_FREE.
> Adding lazyfree folios to the LRU tail is not the best approach, but it is more acceptable than adding them to the LRU head.
>
> >
> > At the time, I proposed introducing an additional "ez_reclaimable" LRU list to
> > store MADV_FREE pages (and potentially other hinted pages in the future).
> > This would allow differentiating priority among LRU lists based on knobs or
> > heuristics.
> This solution looks good, though we might need to think about how to adapt it to MGLRU.

That's right. Adapting to MGLRU isn't straightforward. We might need a separate
generation smaller than min_seq for this, or alternatively, it could be handled by
a separate LRU list that isn't tied to any MGLRU generation. Both seem hard.
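
For the separate-list option, a rough userspace sketch of the idea might look like the following. It is only a toy under stated assumptions: toy_fifo, toy_lruvec and toy_reclaim_one are hypothetical names, the "always drain the hinted list first" policy is just one possible choice (the original proposal left priority to knobs or heuristics), and nothing here reflects how MGLRU generations are actually organized.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAP 8

/* Fixed-size FIFO of folio ids: the oldest entry is popped first. */
struct toy_fifo {
	int id[TOY_CAP];
	size_t head;
	size_t len;
};

static void toy_push(struct toy_fifo *q, int id)
{
	assert(q->len < TOY_CAP);
	q->id[(q->head + q->len) % TOY_CAP] = id;
	q->len++;
}

static int toy_pop(struct toy_fifo *q)
{
	int id;

	assert(q->len > 0);
	id = q->id[q->head];
	q->head = (q->head + 1) % TOY_CAP;
	q->len--;
	return id;
}

/* Hypothetical lruvec with a dedicated list for lazyfree folios. */
struct toy_lruvec {
	struct toy_fifo ez_reclaimable;	/* MADV_FREE'ed folios (assumed) */
	struct toy_fifo inactive_file;	/* ordinary inactive file folios */
};

/* Reclaim one folio: drain the hinted list first, oldest entry first,
 * then fall back to the file list. Returns -1 when both are empty. */
static int toy_reclaim_one(struct toy_lruvec *v)
{
	if (v->ez_reclaimable.len)
		return toy_pop(&v->ez_reclaimable);
	if (v->inactive_file.len)
		return toy_pop(&v->inactive_file);
	return -1;
}
```

With this policy lazyfree folios go before file folios while staying FIFO among themselves, so it would address both the file refault concern and the inversion among MADV_FREE users, at the cost of an extra list that every reclaim path has to know about.
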

> > However, this idea wasn't well-received.