Re: [PATCH] mm/mglru: Use folio_mark_accessed to replace folio_set_active in PF

From: Barry Song

Date: Sun Apr 26 2026 - 17:56:34 EST


On Sat, Apr 25, 2026 at 1:03 AM Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
>
> On Sat, Apr 18, 2026 at 08:02:33PM +0800, Barry Song (Xiaomi) wrote:
> > MGLRU gives high priority to folios mapped in page tables.
> > As a result, folio_set_active() is invoked for all folios
> > read during page faults. In practice, however, readahead
> > can bring in many folios that are never accessed via page
> > tables.
> >
> > A previous attempt by Lei Liu proposed introducing a separate
> > LRU for readahead[1] to make readahead pages easier to reclaim,
> > but that approach is likely over-engineered.
> >
> > Before commit 4d5d14a01e2c ("mm/mglru: rework workingset
> > protection"), folios with PG_active were always placed in
> > the youngest generation, leading to over-protection and
> > increased refaults. After that commit, PG_active folios
> > are placed in the second youngest generation, which is
> > still too optimistic given the presence of readahead. In
> > contrast, the classic active/inactive scheme is more
> > conservative.
> >
> > This patch switches to folio_mark_accessed(). If
> > folio_check_references() later detects referenced PTEs,
> > the folio will be promoted based on the reference flag
> > set by folio_mark_accessed().
> >
>
> There is a following comment and stat update in lru_gen_refault() which is
> referring to setting active bit which this patch is removing.
>
> 	/* see folio_add_lru() where folio_set_active() will be called */
> 	if (lru_gen_in_fault())
> 		mod_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + type, delta);
>
> Is this still relevant or need changes?

This is a very good question. From a counting
perspective, there is no impact: in MGLRU, no code depends
on this counter to make decisions, so it is fine. However,
it raises the question of whether we should proactively
call folio_set_active() for some refaulted folios.

In the classic active/inactive case, we mark recently
refaulted folios as active. workingset_test_recent()
measures the refault distance. If the distance is less than
workingset_size, we mark the refaulted folio as active
to protect it.

	if (!workingset_test_recent(shadow, file, &workingset, true))
		goto out;

	folio_set_active(folio);
	workingset_age_nonresident(lruvec, nr);
	mod_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + file, nr);
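
To make the classic check concrete, here is a rough userspace model
of the refault-distance test. It is only a sketch: struct model_lruvec,
model_refault_distance() and model_test_recent() are hypothetical names,
the real code also packs the eviction counter into a shadow entry and
masks it, and treating the boundary case as "recent" is an assumption
of this sketch.

```c
#include <stdbool.h>

/*
 * Userspace model only; these names are illustrative, not kernel APIs.
 */
struct model_lruvec {
	/* Advances as pages are evicted or activated. */
	unsigned long nonresident_age;
	/* Number of pages the refaulting page has to compete with. */
	unsigned long workingset_size;
};

/*
 * How far the age counter moved between eviction and refault.
 * Unsigned subtraction keeps the result correct across wraparound.
 */
static unsigned long model_refault_distance(const struct model_lruvec *lv,
					    unsigned long eviction_age)
{
	return lv->nonresident_age - eviction_age;
}

/*
 * A refault counts as recent when the page would have stayed resident
 * had the workingset-sized amount of memory been available to it.
 * The boundary handling (<=) is an assumption of this sketch.
 */
static bool model_test_recent(const struct model_lruvec *lv,
			      unsigned long eviction_age)
{
	return model_refault_distance(lv, eviction_age) <= lv->workingset_size;
}
```

With nonresident_age = 1000 and workingset_size = 300, a folio evicted
at age 900 (distance 100) would be activated, while one evicted at
age 100 (distance 900) would not.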

In MGLRU, we compare the current max_seq with the historical
max_seq. If the gap is less than MAX_NR_GENS,
lru_gen_test_recent() considers it recent:

static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
				unsigned long *token, bool *workingset, bool file)
{
	int memcg_id;
	unsigned long max_seq;
	struct mem_cgroup *memcg;
	struct pglist_data *pgdat;

	unpack_shadow(shadow, &memcg_id, &pgdat, token, workingset);

	memcg = mem_cgroup_from_private_id(memcg_id);
	*lruvec = mem_cgroup_lruvec(memcg, pgdat);

	max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
	max_seq &= (file ? EVICTION_MASK : EVICTION_MASK_ANON) >> LRU_REFS_WIDTH;

	return abs_diff(max_seq, *token >> LRU_REFS_WIDTH) < MAX_NR_GENS;
}

But the existing code only ever marks folios read in from PF
as active. In other words, MGLRU unconditionally treats PF as
important and non-PF as unimportant. That is what this patch
addresses: we do not think PF-read folios are always
important.

Maybe we can test for a “very recent” case to emulate the classic
LRU workingset_test_recent(). When it is true, we set the
folio active, regardless of where it originally came from.

lru_gen_test_refault_hot()
{
	...
	return abs_diff(max_seq, *token >> LRU_REFS_WIDTH) <= MIN_NR_GENS;
}
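
To show how the two thresholds would relate, here is a small
userspace model. It is a sketch under assumptions: the model_* names
are hypothetical, the shadow unpacking, masking and LRU_REFS_WIDTH
shift are left out, and the values 2 and 4 are what I believe
MIN_NR_GENS and MAX_NR_GENS are currently defined as.

```c
#include <stdbool.h>

/* Assumed values; check include/linux/mmzone.h for the real ones. */
#define MODEL_MIN_NR_GENS 2UL
#define MODEL_MAX_NR_GENS 4UL

/* Absolute difference of two sequence numbers. */
static unsigned long model_abs_diff(unsigned long a, unsigned long b)
{
	return a > b ? a - b : b - a;
}

/* Mirrors the existing "recent" test: gap strictly below MAX_NR_GENS. */
static bool model_refault_recent(unsigned long max_seq,
				 unsigned long evicted_seq)
{
	return model_abs_diff(max_seq, evicted_seq) < MODEL_MAX_NR_GENS;
}

/* Mirrors the proposed "hot" test: gap of at most MIN_NR_GENS. */
static bool model_refault_hot(unsigned long max_seq,
			      unsigned long evicted_seq)
{
	return model_abs_diff(max_seq, evicted_seq) <= MODEL_MIN_NR_GENS;
}
```

Under these assumed values, a gap of 0 to 2 generations is both
recent and hot, a gap of 3 is recent but not hot, and a gap of 4 or
more is neither, so only the closest refaults would get PG_active.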

diff --git a/mm/workingset.c b/mm/workingset.c
index 07e6836d0502..aaf873101091 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -319,9 +319,16 @@ static void lru_gen_refault(struct folio *folio, void *shadow)

 	atomic_long_add(delta, &lrugen->refaulted[hist][type][tier]);

-	/* see folio_add_lru() where folio_set_active() will be called */
-	if (lru_gen_in_fault())
-		mod_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + type, delta);
+	/*
+	 * If the folio was evicted within the most recent MIN_NR_GENS
+	 * generations, it is considered very hot and should be protected.
+	 */
+	if (lru_gen_test_refault_hot(folio)) {
+		folio_set_active(folio);
+		mod_lruvec_state(lruvec,
+				 WORKINGSET_ACTIVATE_BASE + type,
+				 delta);
+	}

 	if (workingset) {
 		folio_set_workingset(folio);

Maybe this will handle both syscall folios and PF folios in a more
sensible way, rather than simply treating PF as high priority.

>
> I have not yet dig deeper into the patch and the heuristic. Will do later.

Thanks
Barry