Re: [LSF/MM/BPF TOPIC] Reimagining Memory Cgroup (memcg_ext)

From: Shakeel Butt

Date: Wed Mar 11 2026 - 16:41:10 EST


On Wed, Mar 11, 2026 at 03:19:31PM +0800, Muchun Song wrote:
>
>
> > On Mar 8, 2026, at 02:24, Shakeel Butt <shakeel.butt@xxxxxxxxx> wrote:
> >

[...]

> >
> > Per-Memcg Background Reclaim
> >
> > In the new memcg world, with the goal of (mostly) eliminating direct synchronous
> > reclaim for limit enforcement, provide per-memcg background reclaimers which can
> > scale across CPUs with the allocation rate.
>
> Hi Shakeel,
>
> I'm quite interested in this. Internally, we maintain a private set of
> patches implementing asynchronous reclaim, but we are also trying to
> drop as much of that private code as possible. We would therefore like
> to implement a similar asynchronous reclaim mechanism in user space on
> top of memory.reclaim. However, there is currently no suitable policy
> notification mechanism to trigger user threads to proactively reclaim
> in advance.

Cool, can you please share what "suitable policy notification mechanisms" you
need for your use-case? This will give me more data for comparing
memory.reclaim with the proposed approach.
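For readers following along: the userspace approach Muchun describes boils
down to a daemon that watches a cgroup's usage and writes a byte count to its
memory.reclaim file before the limit is hit. A minimal sketch of the policy
side is below; the cgroup path, the 90% margin, and the helper names are my
own illustrative assumptions, not anything from the proposal.

```python
# Sketch of the policy logic for a userspace proactive reclaimer driven
# by the cgroup v2 memory.reclaim interface. All names and thresholds
# here are hypothetical illustrations.

def reclaim_target(usage: int, high: int, margin: float = 0.9) -> int:
    """Return how many bytes to ask the kernel to reclaim so that usage
    drops below margin * high; 0 means no reclaim is needed yet."""
    goal = int(high * margin)
    return max(0, usage - goal)

# A real daemon would poll memory.current against memory.high and, when
# reclaim_target() is positive, write that number to memory.reclaim:
#
#   nbytes = reclaim_target(usage, high)
#   if nbytes:
#       with open("/sys/fs/cgroup/workload/memory.reclaim", "w") as f:
#           f.write(str(nbytes))
```

The open question in the thread is what event should wake such a daemon up,
since polling memory.current is the only trigger available today.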


>
> >
> > Lock-Aware Throttling
> >
> > The ability to avoid throttling an allocating task that is holding locks, to
> > prevent priority inversion. In Meta's fleet, we have observed lock holders stuck
> > in memcg reclaim, blocking all waiters regardless of their priority or
> > criticality.
>
> This is a real problem we have encountered, especially with the jbd2
> handle resources of the ext4 filesystem. Our current approach is to
> defer memory reclaim until the task returns to user space, which
> resolves the various priority-inversion issues caused by holding a
> jbd2 handle. I would therefore be interested in discussing this topic.

Awesome, do you use both memory.max and memory.high, and defer the reclaim
for both? Are you deferring all reclaims, or only those where the charging
process holds a lock? (I need to look up what a jbd2 handle is.)
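Conceptually, the defer-to-userspace-return scheme Muchun describes can be
modeled as: when a charge exceeds the limit while the task holds a kernel
lock, record the overage on the task instead of reclaiming synchronously,
and repay it at the return-to-userspace hook once all locks are dropped.
The sketch below is a simplified userspace model of that idea; the names
and the byte-accounting are my own illustration, not the actual kernel code.

```python
# Simplified model of deferring memcg reclaim to the point where a task
# returns to user space, so reclaim never runs while a kernel lock
# (e.g. a jbd2 handle) is held. All names here are illustrative.

class Task:
    def __init__(self):
        self.pending_reclaim = 0  # bytes of over-limit charge to repay

def try_charge(task: Task, usage: int, limit: int,
               nbytes: int, holds_lock: bool) -> int:
    """Charge nbytes to the memcg. If this pushes usage over the limit
    while the task holds a lock, defer the reclaim; otherwise reclaim
    synchronously (modeled as trimming usage back to the limit)."""
    usage += nbytes
    if usage > limit:
        if holds_lock:
            task.pending_reclaim += usage - limit  # repay later
        else:
            usage = limit  # synchronous reclaim, modeled
    return usage

def return_to_user(task: Task, usage: int) -> int:
    """Hook run on return to user space: perform the deferred reclaim
    now that no kernel locks are held."""
    usage -= task.pending_reclaim
    task.pending_reclaim = 0
    return usage
```

The key property of this shape is that lock waiters are never blocked behind
a holder stuck in reclaim; the cost is that usage can transiently exceed the
limit until the holder returns to user space.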