Re: [PATCH v5 00/14] mm/mglru: improve reclaim loop and dirty folio handling
From: Kairui Song
Date: Fri Apr 17 2026 - 14:03:36 EST
On Fri, Apr 17, 2026 at 10:53 AM wangxinyu19 <wxy2009nrrr@xxxxxxx> wrote:
>
> On Mon, 13 Apr 2026 00:48:14 +0800, Kairui Song wrote:
> > This series is based on mm-unstable, also applies to mm-new.
> >
> > This series cleans up and slightly improves MGLRU's reclaim loop and
> > dirty writeback handling. As a result, we see up to a ~30% improvement
> > in some workloads like MongoDB with YCSB, and a huge decrease in file
> > refaults, with no swap involved. Other common benchmarks show no
> > regression, LOC is reduced, and there are fewer unexpected OOMs, too.
> >
> > Some of the problems were found in our production environment, and
> > others were mostly exposed while stress testing during the development
> > of the LSM/MM/BPF topic on improving MGLRU [1]. This series cleans up
> > the code base and fixes several performance issues, preparing for
> > further work.
> >
> > MGLRU's reclaim loop is a bit complex, and hence these problems are
> > interrelated. The aging, scan number calculation, and reclaim loop
> > are coupled together, and the dirty folio handling logic is quite
> > different, making the reclaim loop hard to follow and the dirty
> > flush ineffective.
> >
> > This series addresses these issues by introducing a scan budget,
> > calculating the number of folios to scan at the beginning of the
> > loop, and by decoupling aging from the reclaim calculation helpers.
> > It then moves the dirty flush logic inside the reclaim loop so it can
> > kick in more effectively. Because the issues are interrelated, the
> > series handles them together and improves MGLRU reclaim in many ways.
> >
> > Test results: All tests were run on a 48c96t, 2-node NUMA machine
> > with 128G of memory, using NVMe as storage.
>
> Hi Kairui,
Hello Xinyu,
> After:
> /proc/vmstat info:
> pgpgin 223,801 (-16.9%)
> pgpgout 308,873
> workingset_refault_anon 498
> workingset_refault_file 17075 (-38.3%)
>
> Launch Time Summary (all apps, all runs)
> Mean 850.5ms (-2.07%)
> P50 861.5ms (-3.04%)
> P90 1179.0ms (-8.05%)
> P95 1228.0ms (-12.2%)
Thanks a lot for testing! The results look good: fewer refaults, lower
pgpgin, and better performance. pgpgout is a bit higher, maybe because
retaining the flags on dirty protected folios helped identify or
protect more file folios; or maybe anon / cold file reclaim is more
effective since writeback-pending folios are now activated to the
hottest gen instead of being stuck on the tail gen; or maybe the
reclaim loop is better structured, so there are fewer wasted loops and
less unnecessary slab reclaim. In any case, it's a good thing. I will
mention this in the next update.
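
For anyone who wants to reproduce this kind of before/after comparison,
here is a minimal sketch of how the percentage deltas on the counters
quoted above could be computed from two /proc/vmstat snapshots. The
helper names (parse_vmstat, percent_change) are hypothetical, and the
sample numbers are purely illustrative; they are not taken from the
test run.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        name, value = parts
        # Tolerate thousands separators such as "223,801".
        counters[name] = int(value.replace(",", ""))
    return counters


def percent_change(before, after):
    """Relative change, in percent, of each counter present in both snapshots."""
    changes = {}
    for name, old in before.items():
        if name in after and old != 0:
            changes[name] = (after[name] - old) * 100.0 / old
    return changes


# Illustrative numbers only (hypothetical, not from the test above).
before = parse_vmstat("pgpgin 1000\nworkingset_refault_file 200\n")
after = parse_vmstat("pgpgin 900\nworkingset_refault_file 150\n")
deltas = percent_change(before, after)

for name, delta in sorted(deltas.items()):
    print(f"{name} {delta:+.1f}%")
```

On a real system the two snapshots would be read from /proc/vmstat
before and after the workload, and the per-snapshot differences (rather
than the raw cumulative counters) would normally be compared.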