Re: [PATCH v4 04/14] mm/mglru: restructure the reclaim loop

From: Kairui Song

Date: Wed Apr 08 2026 - 04:43:51 EST

On Wed, Apr 08, 2026 at 04:08:05PM +0800, Chen Ridong wrote:
> On 2026/4/7 19:57, Kairui Song via B4 Relay wrote:
> > +/*
> > + * For future optimizations:
> > + * 1. Defer try_to_inc_max_seq() to workqueues to reduce latency for memcg
> > + * reclaim.
> > + */
> > static bool try_to_shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
> > {
> > + bool need_rotate = false;
> > long nr_batch, nr_to_scan;
> > - unsigned long scanned = 0;
> > int swappiness = get_swappiness(lruvec, sc);
> > + struct mem_cgroup *memcg = lruvec_memcg(lruvec);
> > +
> > + nr_to_scan = get_nr_to_scan(lruvec, sc, memcg, swappiness);
> > + if (!nr_to_scan)
> > + need_rotate = true;
> >
>
> Will it be simpler if we return directly here?
>
> 	if (!nr_to_scan)
> 		return true;

Looks good to me. I used `need_rotate = true` here since it explains what is
happening a bit better.

>
> I wonder if moving the aging check under `while (nr_to_scan > 0)` can change
> behavior when the scan budget gets shifted down to 0.
>
> In the old code, once `should_run_aging()` became true, reclaim could still go
> through `try_to_inc_max_seq()` instead of being gated by the priority-shifted
> scan budget. With this change, a small lruvec can skip the loop entirely, so a
> lruvec that needs aging to make reclaim progress would neither scan nor age in
> that reclaim round.
>
> Does this have any observable impact on reclaim progress or reclaim balance,
> e.g. by deferring aging until a later retry / higher priority and pushing more
> pressure onto other memcgs?

We also skip aging unconditionally at DEF_PRIORITY, both before and after
this patch. Beyond that, the scan budget can only be shifted to 0 when the
memcg is smaller than 8M. That seems trivial to me; maybe I can just restore
the code below from V3:

	if (!nr_to_scan)
		nr_to_scan = min(evictable, SWAP_CLUSTER_MAX);

https://lore.kernel.org/linux-mm/20260403-mglru-reclaim-v3-4-a285efd6ff91@xxxxxxxxxxx/
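
For reference, a rough userspace sketch of where the 8M figure comes from. It
assumes 4 KiB pages, DEF_PRIORITY == 12, and a budget computed as evictable
pages shifted right by the priority. The helper names are hypothetical and
this is not the kernel code itself:

```c
#include <assert.h>

#define DEF_PRIORITY		12	/* assumption: kernel default priority */
#define ASSUMED_PAGE_SHIFT	12	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the priority-shifted scan budget. */
static unsigned long scan_budget(unsigned long evictable_pages, int priority)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);
	return evictable_pages >> priority;
}

/* Smallest evictable size, in bytes, that still yields a nonzero budget. */
static unsigned long min_bytes_for_budget(int priority)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);
	return (1UL << priority) << ASSUMED_PAGE_SHIFT;
}
```

Under these assumptions the first priority where aging can run is
DEF_PRIORITY - 1, and there the threshold is 2048 pages, i.e. 8M. At
DEF_PRIORITY itself the threshold would be 16M.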

I was a bit worried that tiny cgroups could get over-reclaimed, so maybe:

	if (!nr_to_scan && sc->priority < DEF_PRIORITY)
		nr_to_scan = min(evictable, SWAP_CLUSTER_MAX);
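
To illustrate how that floor would behave, here is a minimal standalone
sketch. It assumes DEF_PRIORITY == 12, SWAP_CLUSTER_MAX == 32, and a budget of
evictable pages shifted right by the priority. budget_with_floor() and
min_ul() are hypothetical names, not functions from the patch:

```c
#include <assert.h>

#define DEF_PRIORITY		12	/* assumption: kernel default priority */
#define SWAP_CLUSTER_MAX	32UL	/* assumption: kernel reclaim batch size */

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical model of the proposed fallback: when the shifted budget
 * is 0, scan a small batch instead, but only below DEF_PRIORITY so that
 * tiny cgroups are left alone on the first, lightest reclaim pass.
 */
static unsigned long budget_with_floor(unsigned long evictable, int priority)
{
	unsigned long nr_to_scan;

	assert(priority >= 0 && priority <= DEF_PRIORITY);
	nr_to_scan = evictable >> priority;

	if (!nr_to_scan && priority < DEF_PRIORITY)
		nr_to_scan = min_ul(evictable, SWAP_CLUSTER_MAX);

	return nr_to_scan;
}
```

In this model a 1000-page lruvec still gets no budget at DEF_PRIORITY, but
gets one batch of 32 pages once the priority drops, and a lruvec with fewer
than 32 evictable pages is capped at its own size.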

I tested the reclaim balancing issue using selftests/cgroup/test_memcontrol,
and all looks good with either design, as the cgroups there are >16M and hence
not affected.

I think we might be overthinking this :)