Re: [PATCH 1/2] mm/vmpressure: skip tree=true accounting on cgroup v2
From: Shakeel Butt
Date: Mon Jun 08 2026 - 13:06:58 EST
On Sat, Jun 06, 2026 at 04:41:33AM -0700, Usama Arif wrote:
> vmpressure() has two outputs gated by the @tree argument:
>
> @tree=false drives in-kernel socket pressure
> (mem_cgroup_set_socket_pressure), consumed by TCP/SCTP. This only
> applies on cgroup v2; on v1 socket memory is charged separately via
> tcpmem and the consumer reads memcg->tcpmem_pressure instead.
>
> @tree=true drives userspace eventfd notifications via the v1
> memory.pressure_level / cgroup.event_control interface.
> v2 has no equivalent: userspace gets reclaim signals
> through memory.pressure (PSI), which does not touch
> vmpressure.
>
> The existing early return covered v1 + @tree=false. The symmetric
> v2 + @tree=true case was falling through and doing the full lock /
> accumulate / schedule_work / parent-walk dance for an events list
> that can never be populated. bpftrace on a 176-core production host
> (cgroup v2, CONFIG_MEMCG_V1=n, 285 memcgs, sustained reclaim) showed
> ~16,200 @tree=true vmpressure() calls per minute. Add an early return
> for the cgroup v2 + @tree=true case, which avoids all of this work.
> On a v2-only host this also eliminates a lock contention path that can
> serialise reclaimers on a single global sr_lock.
>
> Signed-off-by: Usama Arif <usama.arif@xxxxxxxxx>
Acked-by: Shakeel Butt <shakeel.butt@xxxxxxxxx>
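
For anyone following the thread, the gating described in the commit message can be sketched as a small userspace model. This is not the patch itself and not kernel code: the names vmpressure_model_should_account, vmpressure_model_call and struct vmpressure_model_stats are hypothetical, and the on_cgroup_v2 flag merely stands in for whatever hierarchy check the kernel performs.

```c
#include <stdbool.h>

/* Hypothetical userspace model of the @tree gating; not kernel code. */
struct vmpressure_model_stats {
	unsigned long accounted; /* calls that would reach the lock/accumulate path */
	unsigned long skipped;   /* calls dropped by an early return */
};

/* Returns true if a call with this (hierarchy, @tree) pair has a consumer. */
static bool vmpressure_model_should_account(bool on_cgroup_v2, bool tree)
{
	/* Existing early return: v1 + @tree=false has no in-kernel consumer. */
	if (!tree && !on_cgroup_v2)
		return false;

	/* Early return described above: v2 + @tree=true has no eventfd listeners. */
	if (tree && on_cgroup_v2)
		return false;

	return true;
}

/* Records one modelled vmpressure() call in the counters. */
static void vmpressure_model_call(struct vmpressure_model_stats *stats,
				  bool on_cgroup_v2, bool tree)
{
	if (vmpressure_model_should_account(on_cgroup_v2, tree))
		stats->accounted++;
	else
		stats->skipped++;
}
```

In this model, a v2 host only ever accounts the @tree=false calls, and a v1 host only the @tree=true calls; the other two combinations return before any locking would take place.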