Re: [PATCHSET RFC] ptrace,signal: clean transition between STOPPED and TRACED

From: Tejun Heo
Date: Thu Jan 27 2011 - 08:23:25 EST


Hello, sorry about the delay.

On Mon, Jan 17, 2011 at 06:11:33PM -0800, Roland McGrath wrote:
> > 1. When attaching to a STOPPED task or a traced task stops for group
> > stop, the tracee now enters TRACED instead of STOPPED. This is
> > visible via fs/proc but, more importantly, SIGCONT is ignored if a
> > task is TRACED.
>
> That is probably OK, but I'm still not entirely sure about it.
>
> > This may, for example, affect the operation of strace but given how
> > strace always needs to issue further ptrace operations on trap to
> > determine what's going on, I doubt it would actually be worse.
>
> I'm not clear on what effect on strace you have in mind.

I was trying to imagine a case where this could cause a problem. If
there is a program which PTRACE_ATTACH's and then immediately follows
with SIGCONT and expects it to be processed, the end result wouldn't
be what it expects, but I don't think this is an actual problem we
need to worry about.
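
To make the "visible via fs/proc" part of item 1 concrete, here is a
minimal sketch of how userland can read the state letter. It assumes
Linux procfs, where /proc/<pid>/stat reports 'T' for a job-control stop
and 't' for a tracing stop. The helper names proc_state() and
stopped_child_state() are made up for illustration, and the sketch only
exercises the plain STOPPED case, without ptrace.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the state letter from /proc/<pid>/stat,
 * or '\0' if it cannot be read or parsed. */
char proc_state(pid_t pid)
{
    char path[64], buf[512];
    FILE *f;
    size_t n;
    char *p;

    snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
    f = fopen(path, "r");
    if (!f)
        return '\0';
    n = fread(buf, 1, sizeof(buf) - 1, f);
    fclose(f);
    assert(n < sizeof(buf));
    buf[n] = '\0';

    /* Format is "pid (comm) S ..."; comm may contain ')', so search
     * from the end for the parenthesis that closes it. */
    p = strrchr(buf, ')');
    if (!p || p[1] != ' ' || p[2] == '\0')
        return '\0';
    return p[2];
}

/* Hypothetical demo: stop a child with SIGSTOP, wait until the stop is
 * reported, and return the state letter procfs shows for it. */
char stopped_child_state(void)
{
    int status;
    char state = '\0';
    pid_t pid = fork();

    if (pid < 0)
        return '\0';
    if (pid == 0) {
        for (;;)
            pause();    /* child idles until it is killed */
    }

    kill(pid, SIGSTOP);
    if (waitpid(pid, &status, WUNTRACED) == pid && WIFSTOPPED(status))
        state = proc_state(pid);

    kill(pid, SIGKILL); /* SIGKILL also terminates a stopped task */
    waitpid(pid, &status, 0);
    return state;
}
```

With the change described in item 1, a tracee that is attached while
stopped would be expected to show the tracing-stop letter instead.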

> > 2. The transition between STOPPED and TRACED involves a short window
> > of RUNNING in between. On attach, the transition is hidden from the
> > tracer using GROUP_STOP_TRAPPING but it still is visible to other
> > threads in the tracer's group. IOW, if another thread performs
> > WNOHANG wait(2) on the tracee while attach is in progress, the
> > wait(2) may fail even if the tracee is known to be in stopped state
> > before.
> >
> > The same problem exists in the other direction during detach.
> > Currently, the code doesn't try to hide this transition even from
> > the tracer. IOW, if the tracer attaches to a stopped task,
> > detaches, reattaches and then performs WNOHANG wait(2), the wait(2)
> > may fail. However, given the previous behavior where the tracee is
> > always woken up by wake_up_process() on detach, this is highly
> > unlikely to cause any problem.
>
> This seems more problematic to me. I don't like that start/stop window
> at all.

Which case are you worried about? Another thread doing WNOHANG
wait(2) or the same ptracer trying to re-attach immediately after
detaching? Or both?

> Saying "wait may fail" is not sufficiently precise to be helpful. Please
> be more clear. If "fail" means ECHILD, that is unacceptable. If "fail"
> means a WNOHANG wait returns 0 when userland already "knows" that the
> thread is stopped, that might be more acceptable.

It's the latter. The only thing which changes is that the task might
not be in the exact expected state for a brief amount of time.
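
For reference, the two outcomes being distinguished look like this from
userland. This is only a sketch using plain fork() and waitpid(), with
no ptrace involved. The names poll_child() and wnohang_demo() and the
enum are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum poll_result {
    POLL_NOTHING_YET,   /* waitpid() returned 0: child exists, nothing to report */
    POLL_STATE_CHANGED, /* waitpid() returned pid: *status is valid */
    POLL_NO_CHILD,      /* waitpid() failed with ECHILD */
    POLL_ERROR          /* any other failure */
};

/* Hypothetical helper: one non-blocking poll of a specific child. */
enum poll_result poll_child(pid_t pid, int *status)
{
    pid_t r;

    assert(status != NULL);
    r = waitpid(pid, status, WNOHANG | WUNTRACED);
    if (r == 0)
        return POLL_NOTHING_YET;
    if (r == pid)
        return POLL_STATE_CHANGED;
    return errno == ECHILD ? POLL_NO_CHILD : POLL_ERROR;
}

/* Hypothetical demo: returns 1 if a live child polls as "nothing yet"
 * and a reaped child polls as "no child", 0 if not, -1 if fork() fails. */
int wnohang_demo(void)
{
    int status;
    enum poll_result first, second;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();    /* child idles until it is killed */
    }

    first = poll_child(pid, &status);   /* child alive, no event pending */

    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);           /* reap the child */

    second = poll_child(pid, &status);  /* no longer our child */

    return first == POLL_NOTHING_YET && second == POLL_NO_CHILD;
}
```

A caller that sees POLL_NOTHING_YET during the window can simply retry
or fall back to a blocking wait, whereas POLL_NO_CHILD would mean the
child is gone, which is not what happens here.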

For the initial STOPPED -> TRACED transition, the race window doesn't
exist for the ptracer itself. It's only visible if someone other than
the ptracer does the wait(2), which is a pretty convoluted use case to
begin with.

For the TRACED -> STOPPED -> TRACED transition (attach right after
detach), it is visible to the ptracer but again I don't think this is
an even remotely reasonable use case. Plus, it never worked. We've been
issuing SIGCONT unconditionally on TRACED -> STOPPED anyway.

Thank you.

--
tejun