Re: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped

From: Barry Song

Date: Sat Feb 28 2026 - 23:16:37 EST


On Sat, Feb 28, 2026 at 6:28 PM wangzicheng <wangzicheng@xxxxxxxxx> wrote:
>
> Hi Barry,
> >
> > I find your concern a bit surprising. If I understand correctly,
> > you’re observing that file folios are currently being over-reclaimed.
> > In that case, placing hot pages at the tail might make them harder
> > to reclaim after PTE scanning (since they may still be young), but
> > this seems to violate the fundamental principle of LRU. Moreover,
> > when scanning encounters young file folios, reclaim will simply
> > continue scanning more folios to find reclaimable ones, so scanning
> > hot folios only wastes CPU time.
> > Since read-ahead cold folios are placed at the head, relatively hotter
> > folios may be reclaimed instead, causing refaults and further triggering
> > reclaim, which can worsen the situation.
> >
> Thank you for the detailed explanation.
> > >
> > > We'll test this when available and report back. We hope to have a
> > > chance to discuss this topic at LSF/MM/BPF.
> > >
> >
> > Sure, thanks!
> >
> > Barry
>
> For evaluation I’m using a workload that repeatedly cold-starts 20+
> apps on Android and drives the same user actions in each of them.
> I’m comparing the baseline (v6.6) against the patched kernel and watching
> `/proc/vmstat -> workingset_refault_file`, expecting it to go down.
>
> I ran 3 runs per kernel, but `workingset_refault_file` is quite noisy:
> the coefficient of variation is around 40%, so the results don’t look
> statistically solid.
>
> Do you have any suggestions on how to measure the benefit more
> robustly? For example:
> - different or longer-running workloads,
> - better normalization for refaults (per unit time, per fault, etc.),
> - or other vmstat metrics that you found more stable in practice?

I've cc'ed Tangquan, and he may be able to share how he was testing.
Basically, you may want to disable Wi-Fi, as it can introduce a lot of
variability between runs. Aside from refault metrics, you should also
see reduced I/O load and fewer swap-out/in events if you run the same
sequence of apps consistently.
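
A minimal sketch of how those counters could be captured around one run
is below. It is not Tangquan's actual tooling: the helper names
(parse_vmstat, vmstat_delta, read_vmstat) and the choice of counters are
assumptions for illustration only. It snapshots /proc/vmstat before and
after the app sequence and reports the deltas.

```python
from pathlib import Path

# Counters to track. These are standard /proc/vmstat fields; the
# selection itself is an assumption, adjust it to what you care about.
COUNTERS = ("workingset_refault_file", "pgpgin", "pgpgout", "pswpin", "pswpout")


def parse_vmstat(text):
    """Parse '/proc/vmstat'-style 'name value' lines into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def vmstat_delta(before, after, counters=COUNTERS):
    """Return after - before per counter; a missing counter counts as 0."""
    return {name: after.get(name, 0) - before.get(name, 0) for name in counters}


def read_vmstat(path="/proc/vmstat"):
    """Take one snapshot of the live counters (run this on the device)."""
    return parse_vmstat(Path(path).read_text())
```

Call read_vmstat() once before and once after the app sequence and pass
both snapshots to vmstat_delta(), so each run yields one delta per
counter rather than a raw cumulative value.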

>
> I’m also considering increasing the number of runs and using a t-test,
> or comparing the CDF between baseline and patched kernels.
> If you have a preferred methodology, I’d like to align with that.
>
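
As an illustration of the t-test idea above, here is a minimal sketch
(an assumption, not an agreed methodology) that computes the coefficient
of variation for each kernel and Welch's t statistic from per-run
refault deltas, using only the Python standard library. The function
names are hypothetical.

```python
import math
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


def welch_t(baseline, patched):
    """Welch's unequal-variance t statistic and its approximate
    degrees of freedom (Welch-Satterthwaite). Each input needs at
    least two samples with non-zero variance overall."""
    a = statistics.variance(baseline) / len(baseline)
    b = statistics.variance(patched) / len(patched)
    t = (statistics.mean(baseline) - statistics.mean(patched)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (
        a ** 2 / (len(baseline) - 1) + b ** 2 / (len(patched) - 1)
    )
    return t, df
```

Feed it one workingset_refault_file delta per run for each kernel. A
positive t means the baseline mean is higher than the patched one. With
only 3 runs per side the degrees of freedom are small, so the t value
has to be large before it means much; more runs help far more than a
different test does.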

Thanks
Barry