Re: [PATCH] mm/mglru: Use folio_mark_accessed to replace folio_set_active in PF

From: Axel Rasmussen

Date: Mon Apr 27 2026 - 14:26:45 EST


For what it's worth, I agree with this change in principle.

In production we set fault_around_bytes to 4096. That setting is
surprisingly load-bearing (i.e. if I change it, even at a small
experimental scale, I expect workloads to notice and complain). So I
don't think I have an easy way to test this change under production
workloads.

Like Andrew said the workload in the commit message doesn't seem
unreasonable, and the benefit is large.

I guess the workload that would see a downside from this is one that
heavily uses readahead pages but also generates many "one-time-use"
pages instead of maintaining a "fixed" working set. Without activating
the readahead pages, does it lose some of the readahead benefit
because they are pushed out?

About the Sashiko comments, the tier bits being cleared doesn't seem
that problematic to me. However, the WORKINGSET_ACTIVATE counter issue
seems worth fixing.

On Mon, Apr 27, 2026 at 7:46 AM Pedro Falcato <pfalcato@xxxxxxx> wrote:
>
> On Sun, Apr 26, 2026 at 12:35:46PM +0800, Barry Song wrote:
> > On Fri, Apr 24, 2026 at 11:19 PM Pedro Falcato <pfalcato@xxxxxxx> wrote:
> > >
> > > On Sat, Apr 18, 2026 at 08:02:33PM +0800, Barry Song (Xiaomi) wrote:
> > > > MGLRU gives high priority to folios mapped in page tables.
> > > > As a result, folio_set_active() is invoked for all folios
> > > > read during page faults. In practice, however, readahead
> > > > can bring in many folios that are never accessed via page
> > > > tables.
> > > >
> > > > A previous attempt by Lei Liu proposed introducing a separate
> > > > LRU for readahead[1] to make readahead pages easier to reclaim,
> > > > but that approach is likely over-engineered.
> > >
> > > Why does this even need to be kept? I'm not sure it makes sense
> > > to even mark readahead folios as referenced.
> > >
> > > I'd suggest folios should only be marked referenced (or even active, whatever)
> > > when they're mapped. Anything else is a bit random and is hoping you are
> > > eventually going to map them in the future (which is not true for, for example,
> > > anything in an ELF file that may be readahead but not mapped, like debug info,
> > > symbol tables, section headers, relocation tables, etc etc)
> >
> > The patch targets the mmap readahead path rather than the syscall
> > readahead path.
>
> Yes.
>
> >
> > With lru_gen_in_fault() in place, it’s roughly equivalent to
> > the mapped case, since readahead is typically 128 KB while
> > fault_around is 64 KB in PF.
>
> I'm not sure I understand. How is 128KB roughly equivalent to 64KB?
> That's almost a 2x difference!
>
> And readahead, as of now, will read beyond the VMA's limits (which will not
> be mapped).
>
> Really, it's extremely unclear why this adhoc heuristic is here. A folio isn't
> active just because you started readahead for it inside a page fault. It's not
> even necessarily active just because it's mapped (although that is more
> understandable).
>
> And of course, because the whole ordeal is already extremely simple, it turns
> out that in_lru_fault doesn't actually mean "we're inside a page fault" but
> "we're inside a page fault and this VMA/file don't have any sequential/random
> annotations".
>
> I would really like to understand why this is here (and this is true for the
> various odd heuristics mglru uses) if we ever want to have a chance to merge
> the two LRUs together.

This heuristic was added in the original MGLRU implementation, which
is quite a large patch. If it was added later in response to some
specific issue/use case, we'd have more rationale written down. I
think it really just comes down to this line in the original commit
message: "A page is added to the youngest generation on faulting." - I
don't see evidence that a great deal of thought was put into the point
Barry is raising here.


>
> --
> Pedro