Re: [PATCH 10/11] ptrace: move JOBCTL_TRAPPING wait to wait(2) and ptrace_check_attach()

From: Oleg Nesterov
Date: Thu May 12 2011 - 14:21:24 EST


On 05/12, Tejun Heo wrote:
>
> On Thu, May 12, 2011 at 05:59:10PM +0200, Oleg Nesterov wrote:
>
> > Also. _Perhaps_ we can rethink the SIGCONT trapping, and perhaps in
> > this case do_wait() won't need any changes. May be.
>
> But, if there's a better way, sure.

Unfortunately, I have no particular ideas right now, and I doubt
I can invent something clean. But I'd like to at least try to think
about it a bit.


Now about GROUP_STOP_TRAPPING. To simplify the discussion, let's
forget about this series for a moment. Please recall the previous
"ptrace: use GROUP_STOP_TRAPPING for PTRACE_DETACH too" change.

http://marc.info/?l=linux-kernel&m=130486589601593

In short, it changes __ptrace_unlink() to set _TRAPPING if needed,
and ptrace_attach() waits for !TRAPPING unconditionally.
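
A toy userspace model of that flag lifecycle, not kernel code. All
toy_* names are hypothetical, the "if needed" condition (group stop in
effect and the tracee sleeping in a trap) is an assumption, and so is
the simplification that only the stop path clears the flag:

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_TRAPPING 0x1u	/* stand-in for GROUP_STOP_TRAPPING */

struct toy_task {
	unsigned int group_stop;	/* flag word holding TOY_TRAPPING */
	bool group_stopped;		/* a group stop is in effect */
	bool traced_sleep;		/* task sleeps in a ptrace trap */
};

/* Toy __ptrace_unlink(): set TRAPPING only "if needed" (assumed condition). */
static void toy_ptrace_unlink(struct toy_task *t)
{
	assert(t);
	if (t->group_stopped && t->traced_sleep)
		t->group_stop |= TOY_TRAPPING;
	t->traced_sleep = false;	/* the tracee is woken up */
}

/* Toy stop path: in this model, the only place that clears TRAPPING. */
static void toy_do_signal_stop(struct toy_task *t)
{
	assert(t);
	t->group_stop &= ~TOY_TRAPPING;
}

/* Toy ptrace_attach() wait condition: blocks while TRAPPING is set. */
static bool toy_attach_would_block(const struct toy_task *t)
{
	assert(t);
	return (t->group_stop & TOY_TRAPPING) != 0;
}
```

In this model the attach side makes progress only if the task actually
reaches toy_do_signal_stop() after the unlink.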

Problem: TRAPPING can be set outside of do_signal_stop() paths, and
I think we should avoid this as much as possible.

(For now I am ignoring the problem this patch addresses; I think
we can fix it a bit differently.)

As we already discussed, this patch is not right: we have problems
with KILL/CONT. The proposed solution is to clear TRAPPING on kill,
but I think this is not enough.

One particular example. Note that de_thread() waits for ->notify_count == 0
in TASK_UNINTERRUPTIBLE. Btw, this is not good, we need TASK_KILLABLE,
but this doesn't matter in this discussion. The only important thing is
that it is practically impossible to make this path restartable.

Note also that we have PT_TRACE_EXIT: the sub-thread stops even if killed
by the execing thread. (To clarify, this depends on /dev/random and should
be fixed, and in fact it is debatable whether it should stop at all; please
ignore.)

Now suppose we are tracing 2 threads, T1 execs and kills T2, T2 reports
PTRACE_EVENT_EXIT. Now, if the tracer waits for !(T1 & TRAPPING), it will
wait forever.


Say the thread group was stopped, the tracer PTRACE_CONT's T1, and T1 calls
sys_execve() and reports the trap from syscall_trace_enter().

The tracer does ptrace(T1, DETACH) + ptrace(T1, SEIZE) and hangs forever.
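
The hang above can be sketched as a cycle in a wait-for graph. This is a
toy userspace model, not kernel code: the actor names and both functions
are hypothetical, and each edge is a simplified reading of the scenario
(T1 waits in de_thread() for T2, T2 waits in its PTRACE_EVENT_EXIT stop
for the tracer, the tracer waits for T1 to clear TRAPPING):

```c
#include <assert.h>

enum actor { TRACER, T1, T2, NR_ACTORS, NOBODY = -1 };

/* waits_for[a] is the actor that must make progress before a can. */
struct wait_graph {
	int waits_for[NR_ACTORS];
};

/* Build the graph for the exec scenario; the flag says whether the
 * detach left TRAPPING set on T1 (the assumption under discussion). */
static struct wait_graph model_exec_scenario(int trapping_set_by_unlink)
{
	struct wait_graph g;

	g.waits_for[T1] = T2;		/* de_thread() waits for T2 to exit */
	g.waits_for[T2] = TRACER;	/* T2 is stopped in PTRACE_EVENT_EXIT */
	g.waits_for[TRACER] = trapping_set_by_unlink ? T1 : NOBODY;
	return g;
}

/* Follow the single outgoing edge from start; returning to start
 * within NR_ACTORS steps means nobody on the path can ever run. */
static int has_deadlock(const struct wait_graph *g, int start)
{
	int cur = start;

	assert(start >= 0 && start < NR_ACTORS);
	for (int steps = 0; steps < NR_ACTORS; steps++) {
		cur = g->waits_for[cur];
		if (cur == NOBODY)
			return 0;
		if (cur == start)
			return 1;
	}
	return 0;
}
```

With trapping_set_by_unlink set, the walk from the tracer comes back to
the tracer. Without it the tracer is free to resume T2, and the chain
unwinds.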


Once again, this is only one example. There are also coredump, vfork, and
probably something else. In short: I think that
TRAPPING-outside-of-do_signal_stop is a can of worms.

Oleg.
