Re: [PATCH] mm/mglru: Use folio_mark_accessed to replace folio_set_active in PF

From: Barry Song (Xiaomi)

Date: Mon Apr 27 2026 - 21:36:32 EST


On Tue, Apr 28, 2026 at 2:23 AM Axel Rasmussen <axelrasmussen@xxxxxxxxxx> wrote:
>
> For what it's worth, I agree with this change in principle.
>
> In production we set fault_around_bytes to 4096. That setting is
> surprisingly load-bearing (i.e. if I change it, even at a small
> experimental scale, I expect workloads to notice and complain). So I
> don't think I have an easy way to test this change under production
> workloads.
>
> Like Andrew said the workload in the commit message doesn't seem
> unreasonable, and the benefit is large.
>
> I guess the workload that would see a downside from this is one that
> heavily uses readahead pages but also generates many "one-time-use"
> pages instead of maintaining a "fixed" working set. Without activating
> the readahead pages, does it lose some of the readahead benefit
> because they are pushed out?
>
> About the Sashiko comments, the tier bits being cleared doesn't seem
> that problematic to me. However, the WORKINGSET_ACTIVATE counter issue
> seems worth fixing.
>

I am considering something more reasonable than simply
"fixing" the counter. Right now, MGLRU unconditionally
counts every refaulting PF (page fault) folio under
WORKINGSET_ACTIVATE_BASE and ignores all other folios
entirely. I am thinking of a better approach that detects
true recency. In the active/inactive case, that test is
refault_distance < workingset_size.
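
For reference, the active/inactive test can be modeled in user space
roughly as below. This is only a sketch, not the code in
mm/workingset.c: refault_is_recent() and its parameter names are
hypothetical, and the two clock values stand in for the nonresident
age sampled at eviction time and at refault time.

```c
#include <stdbool.h>

/*
 * Hypothetical user-space model of the active/inactive recency test.
 * eviction_clock: age counter sampled when the folio was evicted.
 * refault_clock:  the same counter sampled when the folio faults back in.
 * workingset_size: number of pages the workingset is assumed to span.
 */
static bool refault_is_recent(unsigned long eviction_clock,
			      unsigned long refault_clock,
			      unsigned long workingset_size)
{
	/* Unsigned subtraction stays correct if the counter wrapped once. */
	unsigned long refault_distance = refault_clock - eviction_clock;

	return refault_distance < workingset_size;
}
```

With a workingset of 512 pages, a folio evicted at clock 1000 and
refaulted at 1200 (distance 200) counts as recent, while one
refaulted at 4000 (distance 3000) does not.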

In MGLRU, we could instead check whether the folio was
evicted within the most recent one or two generations. I
am queuing the following for testing:

diff --git a/mm/workingset.c b/mm/workingset.c
index 07e6836d0502..8b552b3d7e37 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -271,10 +271,11 @@ static void *lru_gen_eviction(struct folio *folio)
* Fills in @lruvec, @token, @workingset with the values unpacked from shadow.
*/
static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
- unsigned long *token, bool *workingset, bool file)
+ unsigned long *token, bool *workingset, bool file,
+ unsigned long *gen_distance)
{
int memcg_id;
- unsigned long max_seq;
+ unsigned long max_seq, distance;
struct mem_cgroup *memcg;
struct pglist_data *pgdat;

@@ -286,7 +287,10 @@ static bool lru_gen_test_recent(void *shadow, struct lruvec **lruvec,
max_seq = READ_ONCE((*lruvec)->lrugen.max_seq);
max_seq &= (file ? EVICTION_MASK : EVICTION_MASK_ANON) >> LRU_REFS_WIDTH;

- return abs_diff(max_seq, *token >> LRU_REFS_WIDTH) < MAX_NR_GENS;
+ distance = abs_diff(max_seq, *token >> LRU_REFS_WIDTH);
+ if (gen_distance)
+ *gen_distance = distance;
+ return distance < MAX_NR_GENS;
}

static void lru_gen_refault(struct folio *folio, void *shadow)
@@ -294,7 +298,7 @@ static void lru_gen_refault(struct folio *folio, void *shadow)
bool recent;
int hist, tier, refs;
bool workingset;
- unsigned long token;
+ unsigned long token, distance;
struct lruvec *lruvec;
struct lru_gen_folio *lrugen;
int type = folio_is_file_lru(folio);
@@ -302,7 +306,8 @@ static void lru_gen_refault(struct folio *folio, void *shadow)

rcu_read_lock();

- recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset, type);
+ recent = lru_gen_test_recent(shadow, &lruvec, &token, &workingset, type,
+ &distance);
if (lruvec != folio_lruvec(folio))
goto unlock;

@@ -319,9 +324,11 @@ static void lru_gen_refault(struct folio *folio, void *shadow)

atomic_long_add(delta, &lrugen->refaulted[hist][type][tier]);

- /* see folio_add_lru() where folio_set_active() will be called */
- if (lru_gen_in_fault())
+ /* Activate only if the folio was evicted very recently. */
+ if (distance <= MIN_NR_GENS) {
+ folio_set_active(folio);
mod_lruvec_state(lruvec, WORKINGSET_ACTIVATE_BASE + type, delta);
+ }

if (workingset) {
folio_set_workingset(folio);
@@ -442,7 +449,7 @@ bool workingset_test_recent(void *shadow, bool file, bool *workingset,

rcu_read_lock();
recent = lru_gen_test_recent(shadow, &eviction_lruvec, &eviction,
- workingset, file);
+ workingset, file, NULL);
rcu_read_unlock();
return recent;
}
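
To make the intended behaviour easier to reason about, here is a
stand-alone model of the generation-distance check from the diff
above. It is a sketch under assumptions, not kernel code:
classify_refault(), struct refault_verdict, sketch_abs_diff() and
both SKETCH_* constants are hypothetical names, with the recency
cutoff assumed to be 4 and the activation threshold assumed to be 2.

```c
#include <stdbool.h>

/* Assumed values for this sketch only; not taken from kernel headers. */
#define SKETCH_MAX_NR_GENS	4UL	/* cutoff for "recent" */
#define SKETCH_ACTIVATE_GENS	2UL	/* cutoff for activation */

struct refault_verdict {
	bool recent;	/* evicted fewer than SKETCH_MAX_NR_GENS gens ago */
	bool activate;	/* evicted at most SKETCH_ACTIVATE_GENS gens ago */
};

static unsigned long sketch_abs_diff(unsigned long a, unsigned long b)
{
	return a > b ? a - b : b - a;
}

/*
 * max_seq:     current youngest generation number.
 * evicted_seq: generation number recorded when the folio was evicted.
 * seq_mask:    models the truncation of sequence numbers in the shadow.
 */
static struct refault_verdict classify_refault(unsigned long max_seq,
					       unsigned long evicted_seq,
					       unsigned long seq_mask)
{
	struct refault_verdict v;
	unsigned long distance;

	distance = sketch_abs_diff(max_seq & seq_mask, evicted_seq & seq_mask);
	v.recent = distance < SKETCH_MAX_NR_GENS;
	v.activate = v.recent && distance <= SKETCH_ACTIVATE_GENS;
	return v;
}
```

In this model, with max_seq at 10, a folio evicted in generation 8 or
9 is activated, one evicted in generation 7 is recent but not
activated, and one evicted in generation 5 is neither. If the
truncated counter wraps between eviction and refault, the distance
becomes large and the folio is treated as not recent.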