Re: [External] Re: [PATCH] mm: add swappiness=max arg to memory.reclaim for only anon reclaim

From: Zhongkun He
Date: Wed Mar 19 2025 - 08:53:44 EST


On Wed, Mar 19, 2025 at 1:29 PM Yosry Ahmed <yosry.ahmed@xxxxxxxxx> wrote:
>
> On Wed, Mar 19, 2025 at 10:34:54AM +0800, Zhongkun He wrote:
> > On Tue, Mar 18, 2025 at 10:10 PM Yosry Ahmed <yosry.ahmed@xxxxxxxxx> wrote:
> > >
> > > On Tue, Mar 18, 2025 at 09:53:30PM +0800, Zhongkun He wrote:
> > > > With this patch 'commit <68cd9050d871> ("mm: add swappiness= arg to
> > > > memory.reclaim")', we can submit an additional swappiness=<val> argument
> > > > to memory.reclaim. It is very useful because we can dynamically adjust
> > > > the reclamation ratio based on the anonymous folios and file folios of
> > > > each cgroup. For example, when swappiness is set to 0, we only reclaim
> > > > from file folios.
> > > >
> > > > However, we have also encountered a new issue: when swappiness is set to
> > > > the MAX_SWAPPINESS, it may still only reclaim file folios.
> > > >
> > > > So, we hope to add a new arg 'swappiness=max' in memory.reclaim where
> > > > proactive memory reclaim only reclaims from anonymous folios when
> > > > swappiness is set to max. The swappiness semantics from a user
> > > > perspective remain unchanged.
> > > >
> > > > For example, something like this:
> > > >
> > > > echo "2M swappiness=max" > /sys/fs/cgroup/memory.reclaim
> > > >
> > > > will perform reclaim on the rootcg with a swappiness setting of 'max' (a
> > > > new mode) regardless of the file folios. Users have a more comprehensive
> > > > view of the application's memory distribution because there are many
> > > > metrics available. For example, if we find that a certain cgroup has a
> > > > large number of inactive anon folios, we can reclaim only those and skip
> > > > file folios, because with the zram/zswap, the IO tradeoff that
> > > > cache_trim_mode or other file first logic is making doesn't hold -
> > > > file refaults will cause IO, whereas anon decompression will not.
> > > >
> > > > With this patch, the swappiness argument of memory.reclaim has a new
> > > > mode 'max', which means reclaiming only from anonymous folios in both the
> > > > traditional LRU and MGLRU.
> > >
> > > Is MGLRU handled in this patch?
> >
> > Yes. The value of ONLY_ANON_RECLAIM_MODE is 201, and MGLRU selects the
> > evictable type like this:
> >
> > #define evictable_min_seq(min_seq, swappiness) \
> >         min((min_seq)[!(swappiness)], (min_seq)[(swappiness) <= MAX_SWAPPINESS])
> >
> > #define for_each_evictable_type(type, swappiness) \
> >         for ((type) = !(swappiness); (type) <= ((swappiness) <= MAX_SWAPPINESS); (type)++)
> >
> > If swappiness=0, the type is LRU_GEN_FILE(1).
> >
> > If swappiness=201 (> MAX_SWAPPINESS), the loop becomes
> >         for ((type) = 0; (type) <= 0; (type)++)
> > so the type is always LRU_GEN_ANON(0).
>
> Zhongkun, I see that you already sent a new version. Please wait until
> discussions on a patch are resolved before sending out newer versions,
> and allow more time for reviews in general.

Got it, thanks.

>
> I think this is too subtle, and it's easy to miss. Looking at the MGLRU
> code it seems like there's a lot of swappiness <= MAX_SWAPPINESS checks,
> and I am not sure why these already exist given that swappiness should
> never exceed MAX_SWAPPINESS before this change.
>
> Are there other parts of the MGLRU code that are already using
> swappiness values > MAX_SWAPPINESS?

IIUC, MGLRU can already use the value MAX_SWAPPINESS + 1 to
reclaim only anonymous folios. Please have a look.

In lru_gen_seq_write()->run_cmd():
        else if (swappiness > MAX_SWAPPINESS + 1)
                goto done; /* so MAX_SWAPPINESS + 1 is OK */

In inc_min_seq():
        if (type ? swappiness > MAX_SWAPPINESS : !swappiness)
                goto done; /* skip LRU_GEN_FILE when swappiness is MAX_SWAPPINESS + 1 */

In for_each_evictable_type(), which also skips LRU_GEN_FILE when swappiness
is MAX_SWAPPINESS + 1:
        #define for_each_evictable_type(type, swappiness) \
                for ((type) = !(swappiness); (type) <= ((swappiness) <= MAX_SWAPPINESS); (type)++)

So /sys/kernel/debug/lru_gen can already accept a swappiness value of
MAX_SWAPPINESS + 1 for proactive reclaim, in which case it reclaims only
anonymous folios.
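
To make the loop bounds easier to check, here is a small user-space sketch
(not kernel code) that copies the shape of the for_each_evictable_type()
macro quoted above and records which types it visits. It assumes
MAX_SWAPPINESS is 200, LRU_GEN_ANON is 0 and LRU_GEN_FILE is 1;
evictable_types() is a hypothetical helper that exists only for this
illustration.

```c
/* User-space sketch; constants are assumed to match the values in this thread. */
#define MAX_SWAPPINESS 200
#define LRU_GEN_ANON 0
#define LRU_GEN_FILE 1

/* Same shape as the MGLRU macro quoted in this thread. */
#define for_each_evictable_type(type, swappiness) \
	for ((type) = !(swappiness); \
	     (type) <= ((swappiness) <= MAX_SWAPPINESS); (type)++)

/*
 * Hypothetical helper: returns a bitmask of the types the loop visits,
 * bit LRU_GEN_ANON for anon and bit LRU_GEN_FILE for file.
 */
static unsigned int evictable_types(int swappiness)
{
	unsigned int mask = 0;
	int type;

	for_each_evictable_type(type, swappiness)
		mask |= 1u << type;

	return mask;
}
```

With these assumptions, evictable_types(0) has only the file bit set,
evictable_types(MAX_SWAPPINESS + 1) has only the anon bit set, and any value
from 1 to MAX_SWAPPINESS sets both.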

But the above statement is just my guess. It would be great if Yu could clarify.
If my description is incorrect, please correct me.

>
> Yu, could you help us make things clearer here? I would like to avoid
> relying on current implementation details that could easily be missed
> when making changes. Ideally we'd explicitly check for
> SWAPPINESS_ANON_ONLY.
>
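
For discussion, an explicit check might look roughly like the user-space
sketch below. This is only an assumption about one possible shape, not what
the patch does: SWAPPINESS_ANON_ONLY is taken to be MAX_SWAPPINESS + 1 (201),
and may_evict_anon()/may_evict_file() are hypothetical helper names, not
existing kernel functions.

```c
#include <stdbool.h>

#define MAX_SWAPPINESS 200
/* Assumed value: the anon-only mode discussed in this thread is 201. */
#define SWAPPINESS_ANON_ONLY (MAX_SWAPPINESS + 1)

/* Hypothetical helper: anon is skipped only when swappiness is 0. */
static bool may_evict_anon(int swappiness)
{
	return swappiness != 0;
}

/* Hypothetical helper: file is skipped only in the explicit anon-only mode. */
static bool may_evict_file(int swappiness)
{
	return swappiness != SWAPPINESS_ANON_ONLY;
}
```

The point of naming the mode is that a reader does not have to infer the
anon-only case from a "swappiness <= MAX_SWAPPINESS" comparison.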

Looking forward to Yu's reply.

Thanks.