Re: [PATCH RFC] mm/vmscan:Fix the hot/cold inversion when swappiness = 0 or 201

From: Barry Song

Date: Tue Apr 07 2026 - 19:00:44 EST


On Tue, Apr 7, 2026 at 10:26 PM Kairui Song <ryncsn@xxxxxxxxx> wrote:
>
> On Tue, Apr 07, 2026 at 01:37:08PM +0800, wangzhen wrote:
> > From ac731b061f152cba05b9aa351652a04f933986e0 Mon Sep 17 00:00:00 2001
> > From: w00021541 <wangzhen5@xxxxxxxxxxx>
> > Date: Tue, 7 Apr 2026 16:17:53 +0800
> > Subject: [PATCH RFC] mm/vmscan:Fix the hot/cold inversion when swappiness = 0 or 201
> >
> > In some cases, when swappiness is set to 0 or 201, the oldest generation pages will be changed to the newest generation incorrectly.
> >
> > Consider the following aging scenario:
> > MAX_NR_GENS=4, MIN_NR_GENS=2, swappiness=201, 3 anon gens, 4 file gens.
> > 1. When swappiness = 201, should_run_aging only checks the anon type.
> > should_run_aging returns true.
> > 2. In inc_max_seq, if a type (anon or file) has MAX_NR_GENS generations, inc_min_seq moves its oldest generation pages to the second oldest to prepare for increasing max_seq.
> > Here, the file type will enter inc_min_seq.
> > 3. In inc_min_seq, the first goto condition is true, so the page migration is skipped, resulting in the inversion of cold/hot pages.
> >
> > In fact, when MAX_NR_GENS=4 and MIN_NR_GENS=2, the for loop after the goto is unreachable.
> >
> > Consider the code in inc_max_seq:
> > if (get_nr_gens(lruvec, type) != MAX_NR_GENS)
> > continue;
> > This means that only a type with get_nr_gens==4 can enter inc_min_seq.
> >
> > Discuss the swappiness in three different scenarios:
> > 1<=swappiness<=200:
> > If should_run_aging returns true, both anon and file types must satisfy get_nr_gens<=3, indicating that no type satisfies get_nr_gens==MAX_NR_GENS.
> > Therefore, both cannot enter inc_min_seq.
> >
> > swappiness=201:
> > If should_run_aging returns true, the anon type must satisfy get_nr_gens<=3. Only file type can satisfy get_nr_gens==MAX_NR_GENS.
> > After entering inc_min_seq, type && (swappiness == SWAPPINESS_ANON_ONLY) is true, the for loop will be skipped.
> >
> > swappiness=0:
> > Same as swappiness=201
> >
> > So the two goto statements should be removed. This ensures that when swappiness=0 or 201, the oldest generation pages are correctly promoted to the second oldest generation.
> > (When 1<=swappiness<=200, aging only runs when both anon and file types have get_nr_gens<=3, which prevents the inversion of hot/cold pages.)
> >
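
To make the inversion above concrete, here is a minimal toy model in C. It is
only a sketch, not the kernel code: struct toy_lru, toy_age() and the folio
counts are hypothetical names and values made up for illustration. The one
thing borrowed from mm/vmscan.c is the idea that a sequence number maps to a
list slot via seq % MAX_NR_GENS, so with four generations the slot of the
old min_seq is reused by the new max_seq.

```c
#include <stdio.h>

#define MAX_NR_GENS 4

/* Toy equivalent of lru_gen_from_seq(): a sequence number picks a slot. */
static int gen_from_seq(unsigned long seq)
{
	return seq % MAX_NR_GENS;
}

/* Hypothetical stand-in for one type (e.g. file) of an lruvec. */
struct toy_lru {
	unsigned long min_seq, max_seq;
	unsigned long nr[MAX_NR_GENS];	/* folio count per slot */
};

/*
 * One aging step for a type that already has MAX_NR_GENS generations.
 * skip != 0 models taking "goto done" in inc_min_seq: min_seq advances
 * but the folios stay in their old slot.
 */
static void toy_age(struct toy_lru *l, int skip)
{
	int old_gen = gen_from_seq(l->min_seq);
	int new_gen = gen_from_seq(l->min_seq + 1);

	if (!skip) {
		/* Move the oldest folios into the second oldest generation. */
		l->nr[new_gen] += l->nr[old_gen];
		l->nr[old_gen] = 0;
	}
	l->min_seq++;
	l->max_seq++;	/* the new max_seq reuses the old min_seq slot */
}

/* Folios currently sitting in the youngest generation. */
static unsigned long youngest(const struct toy_lru *l)
{
	return l->nr[gen_from_seq(l->max_seq)];
}

/* Folios currently sitting in the oldest generation. */
static unsigned long oldest(const struct toy_lru *l)
{
	return l->nr[gen_from_seq(l->min_seq)];
}
```

Starting from min_seq=0, max_seq=3 with 1000 folios in slot 0, toy_age(&l, 1)
leaves those 1000 folios in the slot that now backs the youngest generation,
while toy_age(&l, 0) keeps them in the oldest one.
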
> > Signed-off-by: w00021541 <wangzhen5@xxxxxxxxxxx>
> > ---
> > mm/vmscan.c | 14 +++-----------
> > 1 file changed, 3 insertions(+), 11 deletions(-)
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 0fc9373e8251..54c835b07d3e 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -3843,7 +3843,7 @@ static void clear_mm_walk(void)
> > kfree(walk);
> > }
> >
> > -static bool inc_min_seq(struct lruvec *lruvec, int type, int swappiness)
> > +static bool inc_min_seq(struct lruvec *lruvec, int type)
> > {
> > int zone;
> > int remaining = MAX_LRU_BATCH;
> > @@ -3851,14 +3851,6 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, int swappiness)
> > int hist = lru_hist_from_seq(lrugen->min_seq[type]);
> > int new_gen, old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
> >
> > - /* For file type, skip the check if swappiness is anon only */
> > - if (type && (swappiness == SWAPPINESS_ANON_ONLY))
> > - goto done;
> > -
> > - /* For anon type, skip the check if swappiness is zero (file only) */
> > - if (!type && !swappiness)
> > - goto done;
> > -
>
> Hi, thanks for the patch.
>
> We have a very similar patch internally, and the result is kind of bad.
>
> Currently MGLRU forbids the gen distance between file and anon from growing
> larger than 2, which means that with this patch, under heavy pressure, you may
> have to keep rotating a long list of folios of the opposite type in order to
> reclaim the other type.
>
> For example, say you have only 2 gens of file folios, swap disabled,
> and 3 gens of anon folios. Anon folios are unevictable because
> there is no swap, and file folios are also unevictable due to the force
> protection of gens. If the anon folios are mostly cold (or at least a
> portion of them are), the oldest gen of anon folios will be very long
> (e.g. 12G, 3145728 folios).
>
> Now, to reclaim any file folios, you have to age first. Before this
> patch that is usually fast. But after this, it will have to rotate
> all 3145728 folios to the second oldest anon gen, which could take a
> very long time.
>
> During that period any concurrent reclaimer will get rejected
> due to force protection, resulting in very ugly long tail latency or
> unexpected OOM.
>
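
As a rough sanity check on the numbers in the example above, here is a small
C sketch of the arithmetic. The constants are assumptions, not something
stated in this thread: 4 KiB base pages with one folio per page, and a
per-pass batch of 64 * 64 = 4096 folios, which is what I believe
MAX_LRU_BATCH works out to on a 64-bit build. The helper names are
hypothetical.

```c
#include <stdio.h>

/* Assumed values for illustration only; check your own config. */
#define TOY_PAGE_SIZE		4096ULL		/* 4 KiB base pages */
#define TOY_MAX_LRU_BATCH	(64ULL * 64ULL)	/* assumed batch per pass */

/* Number of order-0 folios needed to cover 'bytes' of memory. */
static unsigned long long toy_nr_folios(unsigned long long bytes)
{
	return bytes / TOY_PAGE_SIZE;
}

/* Number of batched passes needed to rotate 'nr' folios (rounded up). */
static unsigned long long toy_nr_batches(unsigned long long nr)
{
	return (nr + TOY_MAX_LRU_BATCH - 1) / TOY_MAX_LRU_BATCH;
}
```

With these assumptions, toy_nr_folios(12ULL << 30) gives 3145728, matching
the figure quoted above, and toy_nr_batches(3145728) gives 768 passes before
the oldest anon gen has been fully rotated.
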
> So I agree this is a good idea in general, and I agree we should do
> this. But it is better to defer this until we patch up MGLRU to remove
> the force protection first.

I suspect that once we can age file and anonymous pages
separately, this issue will resolve itself. David already has
some code for this [1].

Not sure when he will have time to push it upstream, but I
may carve out some time to take care of it this month.

[1] https://lore.kernel.org/linux-mm/aam5nOyXs1sNdjTe@xxxxxxxxxx/

>
> But I think it might be reasonable to remove the SWAPPINESS_ANON_ONLY
> limit now, since that can only be triggered by proactive reclaim,
> which would tolerate long tail latency and won't cause OOM.

It may be better to defer both cases until file and anonymous
pages can be aged separately.

Thanks
Barry