Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang afterPTRACE_ATTACH
From: Oleg Nesterov
Date: Fri Feb 18 2011 - 14:45:31 EST
Hello Tejun,
On 02/18, Tejun Heo wrote:
> Hello, Oleg.
>
> Still trying to follow the new discussion.
And how it goes?
As for me, I am not sure I can follow it ;)
> On Tue, Feb 15, 2011 at 09:27:47PM +0100, Oleg Nesterov wrote:
> > > The reason for the transition to TASK_TRACED is to prevent a race with
> > > SIGCONT waking the task. There is always a race with SIGKILL waking it,
> > > but the circumstances where that can really matter are far fewer.
> > > You need to make sure that the work PTRACE_GETSIGINFO does to access
> > > last_siginfo cannot race with that pointer disappearing or the stack
> > > space it points to becoming invalid. I think the use of siglock ensures
> > > that, but Oleg should verify it.
> >
> > Yes, I think this is safe.
> >
> > I do not really like this idea because it looks a bit strange to treat
> > PTRACE_GETSIGINFO specially, and this doesn't solve all problems. And,
> > once again, I still hope we can change ptrace_resume() so that it doesn't
> > wakeup the stopped (I mean, SIGNAL_STOP_STOPPED) tracee, in this case this
> > hack is not needed.
> >
> > And. We are going to add the new requests which doesn't need the stopped
> > tracee anyway. So we can just add PTRACE_HAS_SIGINFO which returns
> > child->last_siginfo != NULL. This looks simpler, and this is compatible.
> > Of course this check is racy, but this doesn't matter. PTRACE_GETSIGINFO
> > is equally racy if it doesn't change the state to TASK_TRACED.
>
> This is probably where we disagree the most but I think the weird part
> isn't making PTRACE_GETSIGINFO exempt from TASK_TRACE transition. The
> weirdness starts when the tracee is put into TASK_STOPPED while being
> ptraced. I think such dual modes of operation inherently lead to
> strange problems.
>
> Instead of having simple "a ptracer stops in TASK_TRACED and its
> execution is under the control of ptrace",
In fact, I am not sure I really disagree with this part, but see below.
> The patch which puts the tracee into TASK_TRACED
> on ATTACH already fix two problems discussed in this thread without
> doing anything wonky. I think it says a lot.
Yes. One off-topice note... if we are talking about this patch only,
I do not think it makes sense to add the new member into task_struct
so that STOPPED/TRACED transition can always report the exactly correct
->exit_code. I think we can just use group_exit_code ?: SIGSTOP.
But, again, this is off-topic.
> As it currently stands, SIGSTOP/CONT while ptraced doesn't work
And this is probably where we disagree the most. I think this is bug,
and this should be fixed.
> and
> even if we bend the rules subtly and provide sneaky ways like the
> above, userland needs to be modified to make use of it anyway.
Yes. But with the current code we can't modify, say, strace so
that SIGSTOP/CONT can work "correctly".
> I
> think it would be far cleaner to simply make ptracee always stop in
> TASK_TRACED and give the ptracer a way to notice what's happening to
> the tracee
Well. If we accept the proposed PTRACE_CONT-needs-SIGCONT behaviour,
then I think this probably makes sense. The tracee stops under ptrace,
the possible SIGCONT shouldn't abuse debugger which wants to know, say,
the state of registers.
To be honest, I don't understand whether I changed my mind now, or
I was never against this particular change in behaviour.
Once debugger does PTRACE_CONT, the tracee becomes TASK_STOPPED and
now it is "visible" to SIGCONT (or the tracee resumes if SIGCONT has
come in between).
But I think you will equally blame this TRACED/STOPPED transition
as "behavioral subtleties" and I can understand you even if I disagree.
And yes, this leads to other questions. But note that this greatly
simplifies things. The tracee can never participate in the same
group-stop twice.
Oleg.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/