Re: [PATCH V2 1/2] exec: don't wait for zombie threads with cred_guard_mutex held

From: Oleg Nesterov
Date: Mon Feb 20 2017 - 10:25:08 EST


Thanks for looking into this! and sorry for delay.

On 02/17, Eric W. Biederman wrote:
> Oleg Nesterov <oleg@xxxxxxxxxx> writes:
> > - In any case we should limit the scope of cred_guard_mutex in execve paths.
> > It is not clear why do we take it at the start of execve, and worse, it is
> > not clear why we do we actually overload this mutex to exclude other threads
> > (except check_unsafe_exec() but this is solveable). The original motivation
> > was signal->in_exec_mm protection but this idea was replaced by 3c77f8457221
> > ("exec: make argv/envp memory visible to oom-killer"). It is just ugly to
> > call copy_strings/etc with this mutex held.
> The original changes that introduced cred_guard_mutex are:
> a6f76f23d297 ("CRED: Make execve() take advantage of copy-on-write credentials")
> d84f4f992cbd ("CRED: Inaugurate COW credentials")
> So I don't think you actually have your history right.
> Beyond that there is a compelling reason to have exec appear atomic from
> the perspective of ptrace_attach. If the operation is a setuid exec
> and the tracer does not have permission to trace the original or the
> result of the exec there could be some significant information leakage
> if the exec operation is not atomic from the perspective of
> ptrace_attach.

Yes sure.

But I meant execve() should not take cred_guard_mutex at the start, it
should take it later even if we do not rework the security hooks. At least
it should take it after copy_strings(), but probably this needs some work.

> Additionally your comment makes me nervous when you are wondering why we
> take this mutex to exclude other threads and I look in the git history
> and see:
> commit 9b1bf12d5d51bca178dea21b04a0805e29d60cf1
> Author: KOSAKI Motohiro <kosaki.motohiro@xxxxxxxxxxxxxx>
> Date: Wed Oct 27 15:34:08 2010 -0700
> signals: move cred_guard_mutex from task_struct to signal_struct
> Oleg Nesterov pointed out we have to prevent multiple-threads-inside-exec
> itself and we can reuse ->cred_guard_mutex for it. Yes, concurrent
> execve() has no worth.
> Let's move ->cred_guard_mutex from task_struct to signal_struct. It
> naturally prevent multiple-threads-inside-exec.

Yes, and let me explain the original motivation for this change.

To remind, we had a problem with copy_strings() which can use a lot of
memory, and this memory was not visible to OOM-killer.

So we were going to add the new member,

signal_struct->in_exec_mm = bprm->mm

and change OOM-killer to account both task->mm and task->signal->in_exec_mm.

And in this case we obviously need to ensure that only one thread
can enter exec and use signal_struct->in_exec_mm.

That patch was ready, but then we found another (better) solution:
3c77f8457221 ("exec: make argv/envp memory visible to oom-killer").

So I do not think we need to exclude other threads today, and we do
not need to hold cred_guard_mutex throughout the whole execve path.

Again, this needs some work. For example check_unsafe_exec() assumes
it can't race with another thread, see 9e00cdb091b008cb3c78192651180
"exec:check_unsafe_exec: kill the dead -EAGAIN and clear_in_exec logic".
But this looks solvable.

> So while I fully agree we have issues here that we need to address and
> fix your patch description does not inspire confidence.

See above... what do you think I should change in this part of changelog?