Re: [PATCH] copy_process(): Move fd_install() out of sighand->siglock critical section

From: Waiman Long
Date: Wed Feb 09 2022 - 11:25:33 EST


On 2/8/22 16:59, Eric W. Biederman wrote:
Al Viro <viro@xxxxxxxxxxxxxxxxxx> writes:

On Tue, Feb 08, 2022 at 01:51:35PM -0500, Waiman Long wrote:
On 2/8/22 13:16, Al Viro wrote:
On Tue, Feb 08, 2022 at 11:39:12AM -0500, Waiman Long wrote:

One way to solve this problem is to move the fd_install() call out of
the sighand->siglock critical section.

Before commit 6fd2fe494b17 ("copy_process(): don't use ksys_close()
on cleanups"), the pidfd installation was done without holding both
the tasklist_lock and the sighand->siglock. Obviously, holding these
two locks is not really needed to protect the fd_install() call.
So move the fd_install() call down to after the releases of both locks.
Umm... That assumes we can delay it that far. IOW, that nothing
relies upon having pidfd observable in /proc/*/fd as soon as the child
becomes visible there in the first place.

What warranties are expected from CLONE_PIDFD wrt observation of
child's descriptor table?

I think the fd_install() call can be moved after the release of
sighand->siglock but before the release of the tasklist_lock. Will that be good
enough?
Looks like it should, but I'd rather hear from the CLONE_PIDFD authors first...
Christian, could you comment on that?
The tasklist_lock and the siglock provide no protection against
being looked up in proc.

The proc filesystem looks up process information with things only
protected by the rcu_read_lock(). Which means that the process
will be visible through proc after "attach_pid(p, PIDTYPE_PID)".

The fd is being installed in the fdtable of the parent process,
and the siglock and tasklist_lock are held to protect the child.


Further, fd_install is exposing the fd to userspace where it can be used
by the process_madvise and the process_mrelease system calls, from
anything that shares the fdtable of the parent thread. Which means it
needs to be guaranteed that kernel_clone will call wake_up_process
before it is safe to call fd_install.


So it appears to me that moving fd_install earlier is fundamentally unsafe,
and the locks are meaningless from an fd_install perspective.

Which means it should be perfectly fine to move the fd_install outside
of the tasklist_lock and outside the siglock.


I don't see how we could support the fd appearing in the fdtable sooner
which seems to make the question moot as to whether userspace in some
odd corner case expects the fd to appear in the fdtable sooner.

So I say move fd_install down with proc_fork_connector and friends.

Right. Keeping fd_install() inside of tasklist_lock may also be problematic, as a read lock can be taken in interrupt context, which may cause a similar lockdep splat. So I am keeping this patch as is.
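
For illustration only, the ordering being agreed on here can be modeled in user space roughly as below. This is a sketch, not kernel code: model_siglock, publish_fd() and model_copy_process() are hypothetical stand-ins for sighand->siglock, fd_install() and the tail of copy_process(), and a pthread mutex only approximates a spinlock. The point it tries to show is that the descriptor is made visible only after the lock has been dropped.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; none of this is kernel API. */
static pthread_mutex_t model_siglock = PTHREAD_MUTEX_INITIALIZER;
static bool siglock_held;          /* tracked by hand for the check below */
static int published_fd = -1;      /* models a slot in the parent's fdtable */
static bool published_under_lock;  /* set if publish ran inside the lock */

/* Models fd_install(): makes the descriptor visible to other users. */
static void publish_fd(int fd)
{
    if (siglock_held)
        published_under_lock = true;
    published_fd = fd;
}

/* Models the tail of copy_process() with the install moved down. */
static int model_copy_process(int reserved_fd)
{
    pthread_mutex_lock(&model_siglock);
    siglock_held = true;
    /* ... work that really needs the lock would go here ... */
    siglock_held = false;
    pthread_mutex_unlock(&model_siglock);

    /* Publication happens outside the critical section. */
    publish_fd(reserved_fd);
    return reserved_fd;
}
```

In this model, calling model_copy_process(7) leaves published_fd set to 7 and published_under_lock false, which is the property the patch is after.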

Cheers,
Longman