Re: ptrace bug in -rc2+

From: Roland McGrath
Date: Tue Oct 26 2004 - 00:23:26 EST


Sorry it took a while for me to get back to you on this problem.

> The introduction of the new TASK_TRACED state in 2.6.9-rc2 changed the
> behavior of the kernel in a IMHO buggy way. Sending a SIGKILL to a
> process which is traced _and_ stopped doesn't work any more. user mode
> linux kernels do that on shutdown, thats why I ran into this.

This is a change that I explained when I posted the ptrace cleanup patches.
In general it is not safe to do any non-ptrace wakeup of a thread in
TASK_TRACED, because the waking thread could race with a ptrace call that
could be doing things like mucking directly with its kernel stack. AFAIK
noone has established that whatever clobberation ptrace can do to a running
thread is safe even if it will never return to user mode, so we can't allow
this even for SIGKILL.

What we can safely do is make a thread switching out of TASK_TRACED resume
rather than sitting in TASK_STOPPED if it has a pending SIGKILL or SIGCONT.
The following patch does this.

That doesn't make your test program happy. But it should be sufficient for
the shutdown case. When killing all processes, if the tracer gets killed
first, the tracee goes into TASK_STOPPED and will be woken and killed by
the SIGKILL (same as before). If the tracee gets killed first, it gets a
pending SIGKILL and doesn't wake up immediately--but, now, when the tracer
gets killed, the tracee will then wake up to die. You can observe the
change by kill -9'ing your test program and seeing that its child winds up
dead rather than stopped. This will also fix the (same) situations that
can arise now where you have used gdb (or whatever ptrace caller), killed
-9 the gdb and the process being debugged, but still have to kill -CONT the
process before it goes away (now it should just go away either the first
time or when you kill gdb).

Your particular test program is the one special case where we could make
the SIGKILL work immediately: the caller of kill is the ptracer, so we know
noone else can be using ptrace at the same time. But I am not in favor of
adding this special case. If you use ptrace yourself, you should cope.


Thanks,
Roland


Signed-off-by: Roland McGrath <roland@xxxxxxxxxx>

--- linux-2.6/kernel/ptrace.c 23 Oct 2004 17:07:11 -0000 1.39
+++ linux-2.6/kernel/ptrace.c 26 Oct 2004 04:13:16 -0000
@@ -38,6 +38,12 @@ void __ptrace_link(task_t *child, task_t
SET_LINKS(child);
}

+static inline int pending_resume_signal(struct sigpending *pending)
+{
+#define M(sig) (1UL << ((sig)-1))
+ return sigtestsetmask(&pending->signal, M(SIGCONT) | M(SIGKILL));
+}
+
/*
* unptrace a task: move it back to its original parent and
* remove it from the ptrace list.
@@ -61,8 +67,16 @@ void __ptrace_unlink(task_t *child)
* Turn a tracing stop into a normal stop now,
* since with no tracer there would be no way
* to wake it up with SIGCONT or SIGKILL.
+ * If there was a signal sent that would resume the child,
+ * but didn't because it was in TASK_TRACED, resume it now.
*/
+ spin_lock(&child->sighand->siglock);
child->state = TASK_STOPPED;
+ if (pending_resume_signal(&child->pending) ||
+ pending_resume_signal(&child->signal->shared_pending)) {
+ signal_wake_up(child, 1);
+ }
+ spin_unlock(&child->sighand->siglock);
}
}
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/