Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH

From: Tejun Heo
Date: Mon Feb 21 2011 - 10:16:37 EST


Hello,

On Mon, Feb 14, 2011 at 06:54:37PM +0100, Denys Vlasenko wrote:
> > Okay, maybe I'm missing something but so once SIGSTOP is determined to
> > be delivered, then the tracee enters group stop and that's the second
> > SIGSTOP notification you get.  At that point, strace should wait for
> > the tracee to be continued by SIGCONT.  That should work, right?
>
> Do you mean "Will it work on current kernels" or "that's what strace
> has to do and then it is supposed to work correctly, modulo bugs"?

Yes and no. I think it will mostly work on current kernels if we
concentrate only on the actual stopping and continuing part; however,
there are still two obstacles.

1. The first SIGSTOP trap and the second can only be reliably told
apart with GETSIGINFO, which in turn puts the tracee into
TASK_TRACED. The tracee then ignores any future SIGCONT, and the
tracer has no way to detect its reception either. The tracer could
instead make the distinction by looking at the sequence of events,
but that wouldn't work for multithreaded cases or right after
attach.

2. Due to reparenting, wait(2) notifications (including the SIGCLDs)
don't get to the real parent at all.

#2 just needs fixing. I don't think there will be a lot of different
opinions on that one; however, #1 is trickier and one of the biggest
reasons why we have this long thread.
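
For reference, the GETSIGINFO check in #1 could look roughly like the
sketch below. It assumes the behavior described in ptrace(2): the
request succeeds in a signal-delivery stop and fails with EINVAL in a
group stop. All names here (stop_kind, classify_sigstop, probe_sigstop)
are hypothetical and only for illustration. Note that the probe is
itself a ptrace request, so it runs into exactly the TASK_TRACED
problem described in #1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical classification of a stop reported by waitpid(2). */
enum stop_kind {
    STOP_NOT_SIGSTOP,     /* not a stop with SIGSTOP at all */
    STOP_SIGNAL_DELIVERY, /* first trap: SIGSTOP is about to be delivered */
    STOP_GROUP,           /* second notification: tracee is in group stop */
    STOP_UNKNOWN          /* GETSIGINFO failed for some other reason */
};

/*
 * Pure decision step: given the wait status and the outcome of
 * PTRACE_GETSIGINFO (return value and errno), decide which stop it was.
 * Assumption: EINVAL from GETSIGINFO means group stop, per ptrace(2).
 */
enum stop_kind classify_sigstop(int status, long gsi_ret, int gsi_errno)
{
    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP)
        return STOP_NOT_SIGSTOP;
    if (gsi_ret == 0)
        return STOP_SIGNAL_DELIVERY;
    if (gsi_errno == EINVAL)
        return STOP_GROUP;
    return STOP_UNKNOWN;
}

/*
 * Tracer-side probe: issue PTRACE_GETSIGINFO against an already
 * attached tracee and classify the result.
 */
enum stop_kind probe_sigstop(pid_t pid, int status)
{
    siginfo_t si;
    long ret;
    int err;

    if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGSTOP)
        return STOP_NOT_SIGSTOP;

    errno = 0;
    ret = ptrace(PTRACE_GETSIGINFO, pid, NULL, &si);
    err = errno; /* save errno before anything else can clobber it */
    return classify_sigstop(status, ret, err);
}
```

A tracer would call probe_sigstop(pid, status) right after waitpid()
returns for the tracee. The decision step is split out only so that it
can be exercised without a live tracee.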

> In this particular scenario, first SIGSTOP is ptrace-stop.
> Obviously, we must issue ptrace(PTRACE_SYSCALL, $PID, 0x1, SIGSTOP)
> to continue.
>
> Second SIGSTOP is notification of tracee's group-stop to debugger.

So, at this point, the debugger shouldn't continue the tracee by
calling PTRACE_SYSCALL but should do something else. What that should
be is still being discussed.

> The question is, logically, by sending this notification, does tracee,
> or does it not enter into ptrace-stop too? (IOW: is ptrace-stop a separate
> bit in task state, independent of group-stop?)
> If yes, then we need to release tracee from ptrace-stop (but it will remain in
> group-stop) by issuing ptrace(PTRACE_SYSCALL, $PID, 0x1, 0).
> If not, then we must not do so, because the task is not ptrace-stopped,
> and ptrace(PTRACE_SYSCALL, $PID, 0x1, 0) is undefined (I think it should
> error out to indicate that).

That is precisely what is being discussed. IIUC, Oleg and Roland are
saying that the tracee should enter group stop but not ptrace trap at
that point and then transition into ptrace trap on the first PTRACE
call. I was agreeing with that at first but changed my mind after
reading these discussions and now I think we should just put it in
ptrace trap and give the debugger a way to notice the end of group
stop.

Thanks.

--
tejun