Re: [PATCH] mm/mglru: Use folio_mark_accessed to replace folio_set_active in PF

From: Pedro Falcato

Date: Mon Apr 27 2026 - 10:47:36 EST


On Sun, Apr 26, 2026 at 12:35:46PM +0800, Barry Song wrote:
> On Fri, Apr 24, 2026 at 11:19 PM Pedro Falcato <pfalcato@xxxxxxx> wrote:
> >
> > On Sat, Apr 18, 2026 at 08:02:33PM +0800, Barry Song (Xiaomi) wrote:
> > > MGLRU gives high priority to folios mapped in page tables.
> > > As a result, folio_set_active() is invoked for all folios
> > > read during page faults. In practice, however, readahead
> > > can bring in many folios that are never accessed via page
> > > tables.
> > >
> > > A previous attempt by Lei Liu proposed introducing a separate
> > > LRU for readahead[1] to make readahead pages easier to reclaim,
> > > but that approach is likely over-engineered.
> >
> > Why does this even need to be kept? I'm not sure it makes sense
> > to even mark readahead folios as referenced.
> >
> > I'd suggest folios should only be marked referenced (or even active, whatever)
> > when they're mapped. Anything else is a bit random and is hoping you are
> > eventually going to map them in the future (which is not true for, for example,
> > anything in an ELF file that may be readahead but not mapped, like debug info,
> > symbol tables, section headers, relocation tables, etc etc)
>
> The patch targets the mmap readahead path rather than the syscall
> readahead path.

Yes.

>
> With lru_gen_in_fault() in place, it’s roughly equivalent to
> the mapped case, since readahead is typically 128 KB while
> fault_around is 64 KB in PF.

I'm not sure I understand. How is 128KB roughly equivalent to 64KB?
That's almost a 2x difference!

And readahead, as of now, will read beyond the VMA's limits (which will not
be mapped).

Really, it's extremely unclear why this adhoc heuristic is here. A folio isn't
active just because you started readahead for it inside a page fault. It's not
even necessarily active just because it's mapped (although that is more
understandable).

And of course, because the whole ordeal is already extremely simple, it turns
out that in_lru_fault doesn't actually mean "we're inside a page fault" but
"we're inside a page fault and this VMA/file don't have any sequential/random
annotations".

I would really like to understand why this is here (and this is true for the
various odd heuristics mglru uses) if we ever want to have a chance to merge
the two LRUs together.

--
Pedro