Re: [PATCH] mm: switch deferred split shrinker to list_lru
From: Johannes Weiner
Date: Thu Mar 12 2026 - 10:32:23 EST
On Thu, Mar 12, 2026 at 09:23:35AM +1100, Dave Chinner wrote:
> On Wed, Mar 11, 2026 at 11:43:58AM -0400, Johannes Weiner wrote:
> > The deferred split queue handles cgroups in a suboptimal fashion. The
> > queue is per-NUMA node or per-cgroup, not the intersection. That means
> > on a cgrouped system, a node-restricted allocation entering reclaim
> > can end up splitting large pages on other nodes:
> >
> > alloc/unmap
> > deferred_split_folio()
> > list_add_tail(memcg->split_queue)
> > set_shrinker_bit(memcg, node, deferred_shrinker_id)
> >
> > for_each_zone_zonelist_nodemask(restricted_nodes)
> > mem_cgroup_iter()
> > shrink_slab(node, memcg)
> > shrink_slab_memcg(node, memcg)
> > if test_shrinker_bit(memcg, node, deferred_shrinker_id)
> > deferred_split_scan()
> > walks memcg->split_queue
> >
> > The shrinker bit adds an imperfect guard rail. As soon as the cgroup
> > has a single large page on the node of interest, all large pages owned
> > by that memcg, including those on other nodes, will be split.
> >
> > list_lru properly sets up per-node, per-cgroup lists. As a bonus, it
> > streamlines a lot of the list operations and reclaim walks. It's used
> > widely by other major shrinkers already. Convert the deferred split
> > queue as well.
> >
> > The list_lru per-memcg heads are instantiated on demand when the first
> > object of interest is allocated for a cgroup, by calling
> > memcg_list_lru_alloc(). Add calls at the sites where splittable
> > pages are created: anon faults, swapin faults, khugepaged collapse.
> >
> > These calls create all possible node heads for the cgroup at once, so
> > the migration code (between nodes) doesn't need any special care.
> >
> > The folio_test_partially_mapped() state is currently protected and
> > serialized wrt LRU state by the deferred split queue lock. To
> > facilitate the transition, add helpers to the list_lru API to allow
> > caller-side locking.
> >
> > Signed-off-by: Johannes Weiner <hannes@xxxxxxxxxxx>
> > ---
> > include/linux/huge_mm.h | 6 +-
> > include/linux/list_lru.h | 48 ++++++
> > include/linux/memcontrol.h | 4 -
> > include/linux/mmzone.h | 12 --
> > mm/huge_memory.c | 326 +++++++++++--------------------------
> > mm/internal.h | 2 +-
> > mm/khugepaged.c | 7 +
> > mm/list_lru.c | 197 ++++++++++++++--------
> > mm/memcontrol.c | 12 +-
> > mm/memory.c | 52 +++---
> > mm/mm_init.c | 14 --
> > 11 files changed, 310 insertions(+), 370 deletions(-)
>
> Can you please split this up into multiple patches (i.e. one logical
> change per patch) to make it easier to review?
No problem, I'll split it up and send out a v2.

The list_lru changes started out as just the locking helpers, but then
more kept creeping in...
Thanks