Re: [PATCH] mm: throttle and inc min_seq when both page types reach MIN_NR_GENS

From: Zhaoyang Huang
Date: Fri Oct 11 2024 - 21:50:25 EST


On Fri, Oct 11, 2024 at 4:02 PM Zhaoyang Huang <huangzhaoyang@xxxxxxxxx> wrote:
>
> On Fri, Oct 11, 2024 at 12:37 AM Yu Zhao <yuzhao@xxxxxxxxxx> wrote:
> >
> > On Wed, Oct 9, 2024 at 1:50 AM zhaoyang.huang <zhaoyang.huang@xxxxxxxxxx> wrote:
> > >
> > > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > >
> > > The test case in [1] leads to a system hang caused by a local
> > > watchdog thread being starved for over 20s on a 5.5GB RAM
> > > ANDROID15 (v6.6) system. This commit solves the issue by throttling
> > > the reclaimer and increasing min_seq when both page types reach
> > > MIN_NR_GENS, which otherwise may introduce a livelock of switching
> > > type while holding lruvec->lru_lock.
> > >
> > > [1]
> > > Launch the script below 8 times simultaneously; each instance
> > > allocates 1GB of virtual memory and accesses it from user space.
> > > $ costmem -c1024000 -b12800 -o0 &
> > >
> > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > > ---
> > > mm/vmscan.c | 16 ++++++++++++++--
> > > 1 file changed, 14 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > > index cfa839284b92..83e450d0ce3c 100644
> > > --- a/mm/vmscan.c
> > > +++ b/mm/vmscan.c
> > > @@ -4384,11 +4384,23 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
> > > int remaining = MAX_LRU_BATCH;
> > > struct lru_gen_folio *lrugen = &lruvec->lrugen;
> > > struct mem_cgroup *memcg = lruvec_memcg(lruvec);
> > > + struct pglist_data *pgdat = lruvec_pgdat(lruvec);
> > >
> > > VM_WARN_ON_ONCE(!list_empty(list));
> > >
> > > - if (get_nr_gens(lruvec, type) == MIN_NR_GENS)
> > > - return 0;
> > > + if (get_nr_gens(lruvec, type) == MIN_NR_GENS) {
> > > + /*
> > > + * throttle for a while and then increase the min_seq since
> > > + * both page types reach the limit.
> > > + */
> >
> > Sorry but this isn't going to work because in try_to_inc_min_seq(), there is
> > `while (min_seq[type] + MIN_NR_GENS <= lrugen->max_seq) {`
> > to prevent reclaimers from evicting hot memory -- they need to do aging first.
> Thanks for the heads-up. What I assumed was that a running reclaimer
> would do the aging, and the throttled reclaimers would increase
> min_seq when scheduled back and move on. Or could we just drop the
> lock and throttle for a while, to avoid a livelock on 'type = !type'
> while holding the lock?
Please find below the lru_lock contention information [2] that we got
from a syzkaller test, in case the patch [1] is worth discussing; it
throttles the direct reclaimer based on the number of isolated folios.

[1]
https://lore.kernel.org/all/20240716094348.2451312-1-zhaoyang.huang@xxxxxxxxxx/

[2]
[ 295.163779][T8447@C5] preemptoff_warn: C5 T:<8447>syz.2.17 D:40.429ms F:295.123341s E:6.660 ms
[ 295.165000][T8447@C5] preemptoff_warn: C5 enabled preempt at:
[ 295.165000][T8447@C5] _raw_spin_unlock_irq+0x2c/0x5c
[ 295.165000][T8447@C5] evict_folios+0x2504/0x3050
[ 295.165000][T8447@C5] try_to_shrink_lruvec+0x40c/0x594
[ 295.165000][T8447@C5] shrink_one+0x174/0x4cc
[ 295.165000][T8447@C5] shrink_node+0x1c50/0x2088
[ 295.165000][T8447@C5] do_try_to_free_pages+0x560/0xef8
[ 295.165000][T8447@C5] try_to_free_pages+0x4e8/0xaf0
[ 295.165000][T8447@C5] __alloc_pages_slowpath+0x92c/0x1c78
[ 295.165000][T8447@C5] __alloc_pages+0x404/0x48c
[ 295.166277][T298@C0] C0 T:<298>logd.writer D:42.389ms F:295.123885s
[ 295.166337][T298@C0] C0 enabled IRQ at:
[ 295.166337][T298@C0] _raw_spin_unlock_irq+0x20/0x5c
[ 295.166337][T298@C0] evict_folios+0x2504/0x3050
[ 295.166337][T298@C0] shrink_one+0x174/0x4cc
[ 295.166337][T298@C0] shrink_node+0x1c50/0x2088
[ 295.166337][T298@C0] do_try_to_free_pages+0x560/0xef8
[ 295.166337][T298@C0] try_to_free_pages+0x4e8/0xaf0
[ 295.166337][T298@C0] __alloc_pages_slowpath+0x92c/0x1c78
[ 295.166337][T298@C0] __alloc_pages+0x404/0x48c
[ 295.166337][T298@C0] erofs_allocpage+0x90/0xb0
[ 295.167317][T298@C0] preemptoff_warn: C0 T:<298>logd.writer D:43.424ms F:295.123888s
[ 295.168484][T8210@C7] C7 T:<8210>syz-executor D:32.816ms F:295.135666s
[ 295.168507][T8210@C7] C7 enabled IRQ at:
[ 295.168507][T8210@C7] _raw_spin_unlock_irq+0x20/0x5c
[ 295.168507][T8210@C7] evict_folios+0x2504/0x3050
[ 295.168507][T8210@C7] shrink_one+0x174/0x4cc
[ 295.168507][T8210@C7] shrink_node+0x1c50/0x2088
[ 295.168507][T8210@C7] do_try_to_free_pages+0x560/0xef8
[ 295.168507][T8210@C7] try_to_free_pages+0x4e8/0xaf0
[ 295.168507][T8210@C7] __alloc_pages_slowpath+0x92c/0x1c78
[ 295.168507][T8210@C7] __alloc_pages+0x404/0x48c
[ 295.168507][T8210@C7] __get_free_pages+0x24/0x3c
[ 295.168625][T8210@C7] preemptoff_warn: C7 T:<8210>syz-executor D:32.956ms F:295.135666s
[ 295.168645][T8210@C7] preemptoff_warn: C7 enabled preempt at:
[ 295.168645][T8210@C7] _raw_spin_unlock_irq+0x2c/0x5c
[ 295.168645][T8210@C7] evict_folios+0x2504/0x3050
[ 295.168645][T8210@C7] try_to_shrink_lruvec+0x40c/0x594
[ 295.168645][T8210@C7] shrink_one+0x174/0x4cc
[ 295.168645][T8210@C7] shrink_node+0x1c50/0x2088
[ 295.168645][T8210@C7] do_try_to_free_pages+0x560/0xef8
[ 295.168645][T8210@C7] try_to_free_pages+0x4e8/0xaf0
[ 295.168645][T8210@C7] __alloc_pages_slowpath+0x92c/0x1c78
[ 295.168645][T8210@C7] __alloc_pages+0x404/0x48c
[ 295.178291][T8441@C2] C2 T:<8441>syz.3.18 D:42.290ms F:295.135998s
[ 295.178356][T8441@C2] C2 enabled IRQ at:
[ 295.178356][T8441@C2] _raw_spin_unlock_irq+0x20/0x5c
[ 295.178356][T8441@C2] evict_folios+0x2504/0x3050
[ 295.178356][T8441@C2] shrink_one+0x174/0x4cc
[ 295.178356][T8441@C2] shrink_node+0x1c50/0x2088
[ 295.178356][T8441@C2] do_try_to_free_pages+0x560/0xef8
[ 295.178356][T8441@C2] try_to_free_pages+0x4e8/0xaf0
[ 295.178356][T8441@C2] __alloc_pages_slowpath+0x92c/0x1c78
[ 295.178356][T8441@C2] __alloc_pages+0x404/0x48c
[ 295.178356][T8441@C2] bpf_ringbuf_alloc+0x22c/0x434
[ 295.179135][T8441@C2] preemptoff_warn: C2 T:<8441>syz.3.18 D:43.128ms F:295.136000s

>
> >
> > >
> > > + if (get_nr_gens(lruvec, !type) == MIN_NR_GENS) {
> > > + spin_unlock_irq(&lruvec->lru_lock);
> > > + reclaim_throttle(pgdat, VMSCAN_THROTTLE_ISOLATED);
> > > + spin_lock_irq(&lruvec->lru_lock);
> > > + try_to_inc_min_seq(lruvec, get_swappiness(lruvec, sc));
> > > + } else
> > > + return 0;
> > > + }
> > >
> > > gen = lru_gen_from_seq(lrugen->min_seq[type]);
> > >
> > > --
> > > 2.25.1
> > >