Re: [PATCH v2] cgroup: avoid css_set_lock in cgroup_css_set_fork()

From: Michal Koutný

Date: Tue Feb 10 2026 - 05:44:38 EST


Hello Mateusz.

On Thu, Jan 29, 2026 at 02:22:32PM +0100, Michal Koutný <mkoutny@xxxxxxxx> wrote:
> And I'm wondering whether removal only in cgroup_css_set_fork() improves
> parallelism because the tasks (before patching) are queued on the first
> css_set_lock, serialized through the first critical section and when
> they arrive to the second critical section in cgroup_post_fork() their
> arrival rate is already reduced because they had to pass through the
> first critical section. Hence the 2nd pass through the critical section
> should be less contended (w/out waiting).
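The queueing argument above can be illustrated with a toy userspace model (this is not kernel code; the task count and critical-section lengths are invented for illustration). Tasks are first serialized through lock1, so they arrive at lock2 one at a time and barely wait there:

```python
import threading
import time

# Toy model of two back-to-back critical sections: lock1 stands in for
# the css_set_lock section in cgroup_css_set_fork(), lock2 for the one
# in cgroup_post_fork(). NTASKS and SECTION are made-up parameters.
NTASKS = 8
SECTION = 0.005  # seconds spent inside each critical section

lock1 = threading.Lock()
lock2 = threading.Lock()
wait1, wait2 = [], []
stats_lock = threading.Lock()

def task():
    t0 = time.monotonic()
    with lock1:
        w1 = time.monotonic() - t0  # time spent queued on lock1
        time.sleep(SECTION)         # first critical section
    t1 = time.monotonic()
    with lock2:
        w2 = time.monotonic() - t1  # time spent queued on lock2
        time.sleep(SECTION)         # second critical section
    with stats_lock:
        wait1.append(w1)
        wait2.append(w2)

threads = [threading.Thread(target=task) for _ in range(NTASKS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"total wait on lock1: {sum(wait1) * 1e3:.1f} ms")
print(f"total wait on lock2: {sum(wait2) * 1e3:.1f} ms")
```

With all tasks arriving at lock1 at roughly the same time, the accumulated wait on lock1 dwarfs the wait on lock2, matching the intuition that the second section is already decontended.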

I was still curious about this, so I tried my own measurement.
I ran your clone()-based will-it-scale testcase [1].
Basically it was
clone_processes -s 1000 -t 40
on a machine with 40 CPUs / 80 SMT threads.
I periodically watched the `total:` iteration counts reported by
will-it-scale.
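The numbers below are mean ± sample standard deviation over those periodic `total:` samples. A sketch of the post-processing (the log lines here are invented, not my actual data):

```python
import re
import statistics

# Hypothetical will-it-scale log excerpt; only the `total:` lines
# matter for the aggregation.
log = """\
total: 293512
total: 294101
total: 292877
total: 293944
"""

totals = [int(m.group(1)) for m in re.finditer(r"total:\s*(\d+)", log)]
mean = statistics.mean(totals)
stdev = statistics.stdev(totals)  # sample standard deviation
print(f"{mean:.4e} +- {stdev:.1f}")
```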

6.18.8-0-default (baseline := stable + pidmap patches [2][3])
2.9383e+05 ± 1135.5

6.18.8-1.g886f4c4-default (baseline + rwlock impl (previous message))
2.9363e+05 ± 1219.8

6.18.8-1.gb21e8f8-default (baseline + seqcount impl (your patch))
2.9147e+05 ± 1125.6

So I could not reproduce any non-random change with this css_set_lock
split (I consider even the apparent differences between the
implementations rather random).
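A rough back-of-envelope check on that, treating the reported ± values as independent Gaussian uncertainties on the means (which is an assumption): the rwlock result is well under one combined sigma from baseline, and even the seqcount result is only about 1.5 sigma away.

```python
import math

# (mean, stddev) tuples as reported above.
baseline = (2.9383e5, 1135.5)
rwlock   = (2.9363e5, 1219.8)
seqcount = (2.9147e5, 1125.6)

def sigmas(a, b):
    """Difference of the two means in units of the combined stddev."""
    return abs(a[0] - b[0]) / math.hypot(a[1], b[1])

print(f"baseline vs rwlock:   {sigmas(baseline, rwlock):.2f} sigma")
print(f"baseline vs seqcount: {sigmas(baseline, seqcount):.2f} sigma")
```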

At this point I should look into profiles to check whether the
bottleneck really is css_set_lock in cgroup_post_fork(), but I'm
sharing what I have and would be glad for any insights you may have.

Regards,
Michal

[1] Only the clone_processes variant; clone_threads hung randomly.
The will-it-scale/glibc (2.42-3.1) combination likely doesn't handle
the cancellation/(no) join well (but I got hangs even with pthread
cleanup handlers that joined the child thread):

#0 futex_wait (futex_word=0x7ffff7ffd840 <_rtld_local+2112>, expected=2, private=0) at ../sysdeps/nptl/futex-internal.h:146
#1 __GI___lll_lock_wait_private (futex=0x7ffff7ffd840 <_rtld_local+2112>) at lowlevellock.c:34
#2 0x00007ffff7c98d69 in __GI___nptl_deallocate_stack (pd=0x7ffff7ab16c0) at nptl-stack.c:113
...
#5 0x00000000004029ca in kill_tasks () at main.c:151

[2] https://lore.kernel.org/linux-mm/20251206131955.780557-1-mjguzik@xxxxxxxxx/
[3] Those patches improved the metric by about 10% (but I haven't
measured that difference as thoroughly).
