Re: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped

From: 郑堂权(Blues Zheng)

Date: Fri Mar 20 2026 - 05:59:16 EST


Hi Zicheng,

We ran the same RFC on a 6.6 kernel with 8 GB of RAM and zstd, using our internal whole-system performance model. The /proc/vmstat counters before and after the patch are below (negative change = reduction):

Metric                      Before     After   Change
------------------------  --------  --------  -------
pgpgin                    57807848  55738480   -3.58%
pgpgout                   31585160  26367420  -16.52%
pswpin                     2305528   1534481  -33.44%
pswpout                    6618935   5327316  -19.51%
workingset_refault_anon    2104047   1356316  -35.54%
workingset_refault_file    9020966   8407346   -6.80%
workingset_activate_anon   1196828    412937  -65.50%
workingset_activate_file   2941357   1468218  -50.08%
workingset_restore_anon     590337    412322  -30.15%
workingset_restore_file    1801398   1285060  -28.66%
workingset_nodereclaim      201014    152864  -23.95%

Here both file and anon refaults drop, unlike in your Android run; the difference is likely due to workload or environment.
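
For anyone reproducing the percentages above from two /proc/vmstat snapshots, a minimal C sketch of the arithmetic follows. The helper name pct_change() is hypothetical and not part of any kernel or tool; it assumes the change is computed as (after - before) / before, which matches the figures reported here.

```c
#include <assert.h>

/*
 * Hypothetical helper: relative change of a vmstat counter in percent.
 * A negative result means the counter went down after the patch.
 */
static double pct_change(unsigned long long before, unsigned long long after)
{
	/* A zero baseline would make the ratio undefined. */
	assert(before != 0);

	return ((double)after - (double)before) * 100.0 / (double)before;
}
```

For example, pct_change(2305528ULL, 1534481ULL) gives roughly -33.44, the pswpin change reported above.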



-----Original Message-----
From: wangzicheng <wangzicheng@xxxxxxxxx>
Sent: March 19, 2026 18:13
To: Barry Song <21cnbao@xxxxxxxxx>
Cc: akpm@xxxxxxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Suren Baghdasaryan <surenb@xxxxxxxxxx>; Lei Liu <liulei.rjpt@xxxxxxxx>; Matthew Wilcox (Oracle) <willy@xxxxxxxxxxxxx>; Axel Rasmussen <axelrasmussen@xxxxxxxxxx>; Yuanchu Xie <yuanchu@xxxxxxxxxx>; Wei Xu <weixugc@xxxxxxxxxx>; Kairui Song <kasong@xxxxxxxxxxx>; 郑堂权(Blues Zheng) <zhengtangquan@xxxxxxxx>; wangtao <tao.wangtao@xxxxxxxxx>; liulu 00013167 <liulu.liu@xxxxxxxxx>
Subject: RE: [PATCH RFC] mm/mglru: lazily activate folios while folios are really mapped


Hi Barry,

Thank you for the suggestion.

I have redesigned the workload and got relatively promising results.
The workload repeatedly launches and switches between 30 apps for 500 rounds. Since the test takes quite a long time, the final results appear relatively stable across runs.

The testing was done on an Android 16 device with kernel 6.6.89, 8GB RAM, MGLRU enabled.

However, the results are not very easy to interpret.

Average number of kept-alive apps: ±0.08 apps
Average available memory (sampled after each app launch):
baseline vs patched: 2216MB vs 2218MB (~2MB difference)

Below is the vmstat comparison (patched vs baseline):

Metric                      Change
--------------------------- --------
pgpgin                      +2.06%
pgpgout                     +3.10%
pswpin                      +14.13%
pswpout                     +4.55%
pgfault                     -3.19%
pgmajfault                  +12.75%
workingset_refault_anon     +14.77%
workingset_refault_file     +3.48%
workingset_activate_anon    -3.45%
workingset_activate_file    -17.76%
workingset_restore_anon     -3.44%
workingset_restore_file     -19.13%

In v6.6, when PG_active is set, pages go to the youngest generation, while pages without PG_active go to the second oldest generation.
```
static inline bool lru_gen_add_folio(
...
	if (folio_test_active(folio))
		seq = lrugen->max_seq;
	...
	else
		seq = lrugen->min_seq[type] + 1;
```
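
As a rough illustration of the two branches quoted above, here is a small userspace model in C. It is only a sketch: struct gen_window and model_pick_seq() are hypothetical names, the branches elided by "..." in the excerpt are deliberately not modeled, and it assumes at least two generations exist.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; this is not the kernel's data structure. */
struct gen_window {
	unsigned long min_seq;	/* sequence number of the oldest generation */
	unsigned long max_seq;	/* sequence number of the youngest generation */
};

/*
 * Models only the first and last branches of the quoted excerpt:
 * an active folio lands in the youngest generation, anything else
 * in the second oldest one.
 */
static unsigned long model_pick_seq(const struct gen_window *w, bool active)
{
	/* Assumption: at least two generations, so min_seq + 1 is valid. */
	assert(w->min_seq + 1 <= w->max_seq);

	if (active)
		return w->max_seq;
	return w->min_seq + 1;
}
```

With min_seq = 10 and max_seq = 13, an active folio gets seq 13 and an inactive one gets seq 11, so in this model the inactive folio sits one generation above the oldest.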

My rough expectation was that the patch would make file pages more prone to reclaim and make file page hot/cold aging more accurate, so both file refaults and anon refaults might decrease. But here anon refaults increase by 14.77% instead, and file refaults also rise slightly (+3.48%).

I’m not sure if this assumption is correct. Could you share your thoughts on how to interpret these results?

Thanks,
Zicheng

> -----Original Message-----
> From: owner-linux-mm@xxxxxxxxx <owner-linux-mm@xxxxxxxxx> On Behalf Of
> Barry Song
> Sent: Sunday, March 1, 2026 12:16 PM
> To: wangzicheng <wangzicheng@xxxxxxxxx>
> Cc: akpm@xxxxxxxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Suren Baghdasaryan <surenb@xxxxxxxxxx>; Lei
> Liu <liulei.rjpt@xxxxxxxx>; Matthew Wilcox (Oracle)
> <willy@xxxxxxxxxxxxx>; Axel Rasmussen <axelrasmussen@xxxxxxxxxx>;
> Yuanchu Xie <yuanchu@xxxxxxxxxx>; Wei Xu <weixugc@xxxxxxxxxx>; Kairui
> Song <kasong@xxxxxxxxxxx>; Tangquan Zheng <zhengtangquan@xxxxxxxx>;
> wangtao <tao.wangtao@xxxxxxxxx>
> Subject: Re: [PATCH RFC] mm/mglru: lazily activate folios while folios
> are really mapped
>
> On Sat, Feb 28, 2026 at 6:28 PM wangzicheng <wangzicheng@xxxxxxxxx>
> wrote:
> >
> > Hi Barry,
> > >
> > > I find your concern a bit surprising. If I understand correctly,
> > > you’re observing that file folios are currently being over-reclaimed.
> > > In that case, placing hot pages at the tail might make them harder
> > > to reclaim after PTE scanning (since they may still be young), but
> > > this seems to violate the fundamental principle of LRU. Moreover,
> > > when scanning encounters young file folios, reclaim will simply
> > > continue scanning more folios to find reclaimable ones, so
> > > scanning hot folios only wastes CPU time.
> > > Since read-ahead cold folios are placed at the head, relatively
> > > hotter folios may be reclaimed instead, causing refaults and
> > > further triggering reclaim, which can worsen the situation.
> > >
> > Thank you for the detailed explanation.
> > > >
> > > > We'll test this when available and report back. We hope to have
> > > > a chance to discuss this topic at LSF/MM/BPF.
> > > >
> > >
> > > Sure, thanks!
> > >
> > > Barry
> >
> > For evaluation, I’m using a workload that repeatedly cold-starts 20+
> > apps on Android and drives the same user actions in each.
> > I’m comparing baseline (v6.6) vs. the patched kernel and watching
> > `/proc/vmstat -> workingset_refault_file`, expecting it to go down.
> >
> > I ran 3 runs per kernel, but `workingset_refault_file` is quite
> > noisy: the coefficient of variation is around 40%, so the result
> > doesn’t look statistically solid.
> >
> > Do you have any suggestions on how to measure the benefit more
> > robustly? For example:
> > - different or longer-running workloads,
> > - better normalization for refaults (per time, per faults, etc.),
> > - or other vmstat metrics that you found more stable in practice?
>
> I've cc'ed Tangquan, and he may be able to share how he was testing.
> Basically, you may want to disable Wi-Fi, as it can introduce a lot of
> variability between runs. Aside from refault metrics, you should also
> see reduced I/O load and fewer swap-out/in events if you run the same
> sequence of apps consistently.
>
> >
> > I’m also considering increasing the number of runs and using a
> > t-test, or comparing the CDF between baseline and patched kernels.
> > If you have a preferred methodology, I’d like to align with that.
> >
>
> Thanks
> Barry
