Re: [RFC][PATCH] exec: Don't wait for ptraced threads to be reaped.

From: Oleg Nesterov
Date: Sat Mar 04 2017 - 12:05:30 EST

Next message: Oleg Nesterov: "Re: [PATCH 0/2] fix the traced mt-exec deadlock"
Previous message: Tony Lindgren: "Re: [PATCH v3 1/2] ARM: OMAP2+: Fix device node reference counts"
In reply to: Eric W. Biederman: "[RFC][PATCH] exec: Don't wait for ptraced threads to be reaped."
Next in thread: Oleg Nesterov: "Re: [PATCH 0/2] fix the traced mt-exec deadlock"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 03/03, Eric W. Biederman wrote:
>
> Ever since CLONE_THREAD support was added to the kernel it has been
> possible to dead-lock userspace by ptracing a process and not reaping
> it's child threads.

Hmm. I disagree... I do not think this is a bug. But lets discuss this
separately, perhaps I misunderstood you.

> With use of the cred_guard_mutex in proc the ways
> userspace can unknowningly trigger a dead-lock have grown.

I think this particular problem did not exist until cred_guard_mutex
was introduced. Debugger can obviously "delay" exec if it doesn't
reap a zombie sub-thread, but this is another thing and not a bug imo.

> Sovle this by modifying exec to only wait until all of the other
> threads are zombies, and not waiting until the other threads
> are reaped.

This patch looks wrong in many ways.

> @@ -1065,11 +1065,8 @@ static int de_thread(struct task_struct *tsk)
> }
>
> sig->group_exit_task = tsk;
> - sig->notify_count = zap_other_threads(tsk);
> - if (!thread_group_leader(tsk))
> - sig->notify_count--;
> -
> - while (sig->notify_count) {
> + zap_other_threads(tsk);
> + while (atomic_read(&sig->live) > 1) {
> __set_current_state(TASK_KILLABLE);
> spin_unlock_irq(lock);
> schedule();

Very nice. So de_thread() returns as soon as all other threads decrement
signal->live in do_exit(). Before they do, say, exit_mm(). This is already
wrong, for example this breaks OOM. Plus a lot more problems afaics, but
lets ignore this.

Note that de_thread() also unshares ->sighand before return. So in the
case of mt exec it will likely see oldsighand->count != 1 and alloc the
new sighand_struct and this breaks the locking.

Because the execing thread will use newsighand->siglock to protect its
signal_struct while the zombie threads will use oldsighand->siglock to
protect the same signal struct. Yes, tasklist_lock + the fact irq_disable
implies rcu_lock mostly save us but not entirely, say, a foreign process
doing __send_signal() can take the right or the wrong lock depending on
/dev/random.

> @@ -818,6 +808,8 @@ void __noreturn do_exit(long code)
> if (tsk->mm)
> setmax_mm_hiwater_rss(&tsk->signal->maxrss, tsk->mm);
> }
> + if ((group_left == 1) && tsk->signal->group_exit_task)
> + wake_up_process(tsk->signal->group_exit_task);

This is racy, but this is minor.

Oleg.

Next message: Oleg Nesterov: "Re: [PATCH 0/2] fix the traced mt-exec deadlock"
Previous message: Tony Lindgren: "Re: [PATCH v3 1/2] ARM: OMAP2+: Fix device node reference counts"
In reply to: Eric W. Biederman: "[RFC][PATCH] exec: Don't wait for ptraced threads to be reaped."
Next in thread: Oleg Nesterov: "Re: [PATCH 0/2] fix the traced mt-exec deadlock"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]