Maybe a race window in cgroup.kill?

From: Tejun Heo
Date: Fri Jan 24 2025 - 16:33:18 EST


Hello, Christian.

I was looking at the cgroup.kill implementation and wondering whether there
could be a race window. __cgroup_kill() does the following:

k1. Set CGRP_KILL.
k2. Iterate tasks and deliver SIGKILL.
k3. Clear CGRP_KILL.

copy_process() does the following:

c1. Copy a bunch of stuff.
c2. Grab siglock.
c3. Check fatal_signal_pending().
c4. Commit to forking.
c5. Release siglock.
c6. Call cgroup_post_fork(), which puts the task on the css_set and tests
    CGRP_KILL.

The intention seems to be that either a forking task gets SIGKILL and
terminates at c3, or it sees CGRP_KILL at c6 and kills the child. However, I
don't see what guarantees that k3 can't happen before c6, i.e., after a
forking task passes c5, k2 can take place, and then k3 can happen before the
forking task reaches c6. Then nobody would send SIGKILL to the child.
What am I missing?
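
To make the interleaving concrete, here is a minimal sketch that replays the
steps above in a given order. It is a toy model, not kernel code: struct model,
run_step() and child_is_covered() are hypothetical names, each step is assumed
to be atomic, and c1, c2, c4 and c5 are folded into the c3 check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Step labels follow the lists above; c1, c2, c4 and c5 are folded into C3. */
enum step { K1, K2, K3, C3, C6 };

/* Hypothetical model state, not kernel data structures. */
struct model {
	bool cgrp_kill;        /* stands in for CGRP_KILL */
	bool parent_sigkill;   /* SIGKILL pending on the forking task */
	bool fork_aborted;     /* fork bailed out at c3 */
	bool child_on_css_set; /* child is visible to the k2 iteration */
	bool child_sigkill;    /* SIGKILL delivered to the child */
};

static void run_step(struct model *m, enum step s)
{
	switch (s) {
	case K1:
		m->cgrp_kill = true;
		break;
	case K2:
		/* Assumption: only tasks already on the css_set are signalled. */
		m->parent_sigkill = true;
		if (m->child_on_css_set)
			m->child_sigkill = true;
		break;
	case K3:
		m->cgrp_kill = false;
		break;
	case C3:
		/* Models the fatal_signal_pending() check. */
		if (m->parent_sigkill)
			m->fork_aborted = true;
		break;
	case C6:
		if (m->fork_aborted)
			break;
		/* Models cgroup_post_fork(): link the child, then test the flag. */
		m->child_on_css_set = true;
		if (m->cgrp_kill)
			m->child_sigkill = true;
		break;
	}
}

/* True if the child was never created or ended up with SIGKILL. */
static bool child_is_covered(const enum step *order, size_t n)
{
	struct model m = { 0 };
	size_t i;

	assert(order != NULL);
	for (i = 0; i < n; i++)
		run_step(&m, order[i]);
	return m.fork_aborted || m.child_sigkill;
}
```

In this model, the order k1, c3, k2, k3, c6 leaves the child uncovered, while
k1, c3, k2, c6, k3 does not, because c6 still sees the flag set.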

Thanks.

--
tejun