Re: [PATCH v5 00/14] mm/mglru: improve reclaim loop and dirty folio handling

From: Kairui Song

Date: Sat Apr 18 2026 - 04:17:30 EST

On Sat, Apr 18, 2026 at 3:38 PM wangzicheng <wangzicheng@xxxxxxxxx> wrote:
>
> > Hi Kairui,
> >
> > We have tested this patch series on an Android device under a typical scenario.
> >
> > The test consisted of cold-starting multiple applications sequentially
> > under moderate system load (some services running in the background,
> > such as map navigation and an AI voice assistant). Each test round cold-starts
> > a fixed set of apps one by one and records the cold start latency.
> > A total of 100 rounds were conducted to ensure statistical significance.
> >
>
> Hi Xinyu and Kairui,
>
> We have tested the patch under a **heavy** load benchmark for camera.
>
> > Before:
> > /proc/vmstat info:
> > pgpgin 269,224
> > pgpgout 226,078
> > workingset_refault_anon 237
> > workingset_refault_file 27689
> >
> > Launch Time Summary (all apps, all runs)
> > Mean 868.0ms
> > P50 888.0ms
> > P90 1274.2ms
> > P95 1399.0ms
> >
> > After:
> > /proc/vmstat info:
> > pgpgin 223,801 (-16.9%)
> > pgpgout 308,873
> > workingset_refault_anon 498
> > workingset_refault_file 17075 (-38.3%)
> >
> > Launch Time Summary (all apps, all runs)
> > Mean 850.5ms (-2.07%)
> > P50 861.5ms (-3.04%)
> > P90 1179.0ms (-8.05%)
> > P95 1228.0ms (-12.2%)
> >
> > --
> > Best regards,
> > Xinyu
> >
>
> We evaluated the backported patches on android16-6.12 using a **heavy**

Hi Zicheng

I'm not sure how you did that. This series applies on mm-unstable, and
there is a large gap between that and 6.12.

> mobile workload on a Qualcomm 8850 device (16GB RAM + 16GB zram).
> (vmscan code in this tree is largely similar to v6.18)
>
> The workload simulates real user behavior by sequentially
> cold-starting 23 apps. For each application we perform the related
> operations (short‑video swiping, background music playback, and
> navigation). After exiting one application the next is launched
> immediately in 1s. After all apps complete, the camera is launched
> and a photo is taken.
>
> Baseline and patched kernels were tested under identical conditions.
> (with a fan kept cooling the testbed)
> Full system traces were collected for three runs in each
> configuration, and ten additional traces were recorded for the final
> camera launch stage.
>
> Overall application keepalive behavior shows no noticeable
> difference. However, we observed performance deviations in some
> memory‑pressure scenarios.
>
> Before:
> Meminfo (100 ms per sample, average result)
> MemAvailable: 5420
> MemFree: 1421
> Cached: 3862
> AnonPages: 3804
> Dirty: 62
> vmstat counters (last sample)
> pgpgin: 3,701,869
> pgpgout: 3,545,058
> workingset_refault_anon: 390,967
> workingset_refault_file: 79,927
> Total app launch time (23 apps + launcher × 23): 7702 ms
> Camera launch time: 684 ms
>
> After:
> Meminfo (100 ms per sample, average result)
> MemAvailable: 5058 (-7%)
> MemFree: 1382 (-3%)
> Cached: 3213 (-17%)
> AnonPages: 3637 (-4%)
> Dirty: 35 (-44%)
> vmstat counters (last sample)
> pgpgin: 5,752,429 (+55%)
> pgpgout: 3,668,788 (+3%)
> workingset_refault_anon: 1,492,964 (+282%)
> workingset_refault_file: 590,505 (+639%)
> Total app launch time (23 apps + launcher × 23): 8872 ms (+15%)
> Among the tested apps, 11 improved while 14 regressed.
> Camera launch time: 980 ms (+43%), which is also the stage with the
> highest memory pressure.
>
> From whole trace analysis, direct reclaim appears to run slower.
> Before vs. after
> total duration: 11659 ms / 57006 ms

Being 5 times slower seems really horrible, but I'm not sure what is
causing that, as there seem to be very few dirty folios in your test
case. I know there are some vendor hooks for Android. Since MGLRU now
uses the common routine, could these hooks also be affecting MGLRU
while the modules aren't aware of that, causing the strange
behavior?

> total reclaimed: 3953 MB / 6344 MB
> speed: 0.339 MB/ms / 0.111 MB/ms
> times: 16117 / 27562
>
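
For reference, a quick sketch to sanity-check the figures above. It
assumes "times" is the number of direct reclaim invocations (the report
doesn't say so explicitly), and the helper name reclaim_stats is made up
for illustration:

```python
# Sanity check of the direct reclaim numbers quoted above.
# Assumption: "times" counts direct reclaim invocations.

def reclaim_stats(duration_ms, reclaimed_mb, calls):
    """Derive throughput and per-call cost from the trace totals."""
    return {
        "mb_per_ms": reclaimed_mb / duration_ms,
        "ms_per_call": duration_ms / calls,
        "mb_per_call": reclaimed_mb / calls,
    }

before = reclaim_stats(duration_ms=11659, reclaimed_mb=3953, calls=16117)
after = reclaim_stats(duration_ms=57006, reclaimed_mb=6344, calls=27562)

# Ratio of total time spent in direct reclaim (after / before).
slowdown = 57006 / 11659
# Ratio of reclaim throughput (before / after).
throughput_drop = before["mb_per_ms"] / after["mb_per_ms"]

for name, stats in (("before", before), ("after", after)):
    print(name, {k: round(v, 3) for k, v in stats.items()})
print("duration ratio:", round(slowdown, 2))
print("throughput ratio:", round(throughput_drop, 2))
```

So total time in direct reclaim is up by roughly 4.9x, while throughput
is down by roughly 3x because more memory got reclaimed overall. Under
the same assumption, the average cost per invocation goes from about
0.7 ms to about 2.1 ms.
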
> The performance might behave differently on devices with smaller memory
> (e.g. 8–16GB) compared to servers with 100+GB memory, or under
> moderate to heavy memory pressure.
> Could this be related to patch 09/14[1] which removes folio_inc_gen()
> when `writeback || (type == LRU_GEN_FILE && dirty)`?
>
> Any comments or suggestions would be appreciated.

Can you share the code you actually tested, or maybe test it on
mm-unstable / mm-unstable + this series? Or how can we reproduce that?
Or maybe share a full log or a dump of the lru_gen info and vmstat?
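
Something like the following would be enough for the vmstat part. This
is only a minimal sketch, not a tool from this series: the function
names and the counter list are made up for illustration, and it just
diffs two /proc/vmstat snapshots taken around the workload:

```python
# Hypothetical helper: diff two /proc/vmstat snapshots.
from pathlib import Path

# Counters discussed in this thread; extend as needed.
KEYS = (
    "pgpgin",
    "pgpgout",
    "workingset_refault_anon",
    "workingset_refault_file",
)

def parse_vmstat(text):
    """Parse /proc/vmstat-style 'name value' lines into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats

def snapshot(path="/proc/vmstat"):
    """Read and parse the current counters (Linux only)."""
    return parse_vmstat(Path(path).read_text())

def vmstat_delta(before, after, keys=KEYS):
    """Return {key: (before, after, percent change)} for keys in both."""
    out = {}
    for key in keys:
        if key in before and key in after:
            b, a = before[key], after[key]
            # Percent change is undefined when the baseline is zero.
            pct = (a - b) / b * 100.0 if b else None
            out[key] = (b, a, pct)
    return out
```

Call snapshot() before and after the run and pass both results to
vmstat_delta(). The lru_gen info can be read from
/sys/kernel/debug/lru_gen, assuming debugfs is mounted on your build.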

>
> [1] https://lore.kernel.org/linux-mm/20260413-mglru-reclaim-v5-0-8eaeacbddc44@xxxxxxxxxxx/T/#m568eba84d35d8d5ff519d3e29237de6d64f67659
>
> Best,
> Zicheng
>