Re: [RFC][PATCH 2/2] exec: If possible don't wait for ptraced threads to be reaped

From: Eric W. Biederman
Date: Sun Apr 02 2017 - 14:58:34 EST


Oleg Nesterov <oleg@xxxxxxxxxx> writes:

> On 04/01, Eric W. Biederman wrote:
>>
>> --- a/fs/exec.c
>> +++ b/fs/exec.c
>> @@ -1052,6 +1052,7 @@ static int de_thread(struct task_struct *tsk)
>> struct signal_struct *sig = tsk->signal;
>> struct sighand_struct *oldsighand = tsk->sighand;
>> spinlock_t *lock = &oldsighand->siglock;
>> + bool may_hang;
>>
>> if (thread_group_empty(tsk))
>> goto no_thread_group;
>> @@ -1069,9 +1070,10 @@ static int de_thread(struct task_struct *tsk)
>> return -EAGAIN;
>> }
>>
>> + may_hang = atomic_read(&oldsighand->count) != 1;
>> sig->group_exit_task = tsk;
>> - sig->notify_count = zap_other_threads(tsk);
>> - if (!thread_group_leader(tsk))
>> + sig->notify_count = zap_other_threads(tsk, may_hang ? 1 : -1);
>
> Eric, this is amazing. So with this patch exec does different things depening
> on whether sighand is shared with another CLONE_SIGHAND task or not. To me
> this doesn't look sane in any case.

It is a 99% solution that makes it possible to talk about and review
letting the exec continue after the subthreads are killed but not
reaped.

Sigh I should have made may_hang say:

may_hang = (atomic_read(&oldsignand->count) != 1) && (sig->nr_threads > 1)

Which covers all know ways userspace actually uses these clone flags.

> And btw zap_other_threads(may_hang == 0) is racy. Either you need tasklist or
> exit_notify() should set tsk->exit_state under siglock, otherwise zap() can
> return the wrong count.

zap_other_thread(tsk, 0) only gets called in the case where we don't
care about the return value. It does not get called from fs/exec.c

> Finally. This patch creates the nice security hole. Let me modify my test-case
> again:
>
> void *thread(void *arg)
> {
> ptrace(PTRACE_TRACEME, 0,0,0);
> return NULL;
> }
>
> int main(void)
> {
> int pid = fork();
>
> if (!pid) {
> pthread_t pt;
> pthread_create(&pt, NULL, thread, NULL);
> pthread_join(pt, NULL);
> execlp(path-to-setuid-binary, args);
> }
>
> sleep(1);
>
> // Now we can send the signals to setiuid app
> kill(pid+1, ANYSIGNAL);
>
> return 0;
> }

That is a substantive objection, and something that definitely needs
to get fixed. Can you think of anything else?

Eric