Re: [PATCH V2 1/2] exec: don't wait for zombie threads with cred_guard_mutex held

From: Eric W. Biederman
Date: Tue Feb 21 2017 - 15:25:11 EST

Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> On 02/21, Eric W. Biederman wrote:
>> Today cred_guard_mutex is part of making exec appear to be an atomic
>> operation to ptrace and and proc. To make exec appear to be atomic
>> we do need to take the mutex at the beginning and release it at the end
>> of exec.
>> The semantics of exec appear atomic to ptrace_attach and to proc readers
>> are necessary to ensure we use the proper process credentials in the
>> event of a suid exec.
> This is clear. My point is that imo a) it is over-used in fs/proc and b)
> the scope of this mutex if execve is too huge. I see absolutely no reason
> to do copy_strings() with this mutex held, for example. And note that
> copy_strings() can use a lot of memory/time, it can trigger oom,swapping,
> etc.

I agree that we can do things like copy_strings that don't change the
data structures and of the task without before taking the cred_guard_mutex.

> But let me repeat, this is a bit off-topic right now, this patch doesn't
> change anything in this respect, afaics.
>> I believe making cred_guard_mutex per task is an option. Reducing the
>> scope of cred_guard_mutex concerns me. There appear to be some fields
>> like sighand that we currently expose in proc
> please see another email, collect_sigign_sigcatch() is called without this
> mutex.

I agree that it is called without the mutex. It is not clear to me that
is the correct behavior. It violates the fundamental property that
exec of a setuid executable should be an atomic operation. I don't know
how much we care but it disturbs me that we can read something of a
processes signal handling state with the wrong credentials.

Adopting an implementation where we can never fix this apparent bug
really really disturbs me.

>> Do you know if we can make cred_guard_mutex a per-task lock again?
> I think we can, but this needs some (afaics simple) changes too.
> But for what? Note that the problem fixed by this series won't go away
> if we do this.

I believe it will if the other waiters use mutex_lock_killable.

> So what do you think about this series?

I like the second patch. That seems clean and reasonable.

I really don't like the first patch. It makes an information leak part
a required detail of the implementation and as such possibly something
we can never change. It attempts to paint a picture for a full fix in
the future that appears to result in an incorrect kernel. That really
bugs me.

I suspect that a good fix that respects that proc and ptrace_attach need
to exclude the setuid exec case for semantic reasons would have a similar

I think a mutex doing the job that cred_guard_mutex is doing especially
when we have multiple readers and a single writer is the wrong locking
primative. A reader-writer lock or something even cheaper would
probably be much better.

I think fixing the deadlock is important.

I think structuring the fix in such a way that the code is easily
maintainable in the future and is also very important.

Right now it feels like your fix in patch 1 makes things a bit more
brittle and I don't like that at all.