PID namespace init releases its file locks before its children die
From: Demi Marie Obenour
Date: Thu Oct 02 2025 - 14:22:48 EST
I noticed that PID 1 in a PID namespace can release its file locks
(because it is exiting) while its children are still running for a
short time. If the locks held by PID 1 were relied upon to serialize
the execution of its child processes, this could result in data
corruption.
Specifically, the child processes are killed via exit_notify() ->
forget_original_parent() -> find_child_reaper() ->
zap_pid_ns_processes(). In do_exit(), that chain runs *after*
exit_files(), which releases the file locks.
While it is possible to implement this kind of supervision with
cgroups, cgroups are quite a bit more complicated to use than
a single call to unshare() before fork().
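For reference, the unshare()-before-fork() pattern can be sketched
roughly as follows. The names child_is_ns_init() and
run_in_new_pidns() are hypothetical, and the sketch assumes the caller
has CAP_SYS_ADMIN; without it, unshare(CLONE_NEWPID) fails and the
helper returns -1.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Example callback: exits 0 if it really is PID 1 of its namespace. */
int child_is_ns_init(void *arg)
{
    (void)arg;
    return getpid() == 1 ? 0 : 1;
}

/*
 * Hypothetical helper: run fn(arg) as PID 1 of a new PID namespace and
 * wait for it. Returns the child's exit status, or -1 on failure.
 */
int run_in_new_pidns(int (*fn)(void *), void *arg)
{
    assert(fn != NULL);

    /*
     * unshare(CLONE_NEWPID) does not move the caller. Only the next
     * child it forks lands in the new namespace, as PID 1.
     */
    if (unshare(CLONE_NEWPID) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(fn(arg));   /* child: namespace init */

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

When the namespace's PID 1 exits, the kernel kills every other process
in that namespace with SIGKILL, which is what makes this attractive
for supervision. Note that after this init exits, further fork()
calls from the same caller fail, so the helper is one-shot per process.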
Is this ordering intentional? Changing it would make supervision
trees significantly easier to implement properly.
--
Sincerely,
Demi Marie Obenour (she/her/hers)