Re: ptrace group stop signal number not reset before PTRACE_INTERRUPT is delivered?

From: Oleg Nesterov
Date: Thu Aug 18 2016 - 21:07:14 EST


On 08/18, Oleg Nesterov wrote:
>
> On 08/17, Keno Fischer wrote:
> >
> > In this test case, the last
> > group-stop (after PTRACE_INTERRUPT) is delivered with a
> > WSTOPSIG(status) of SIGTTIN, which was the signr of the previous group
> > stop. From reading the man-page, I would have expected SIGTRAP.
>
> Me too ;)

Yes, but on second thought...

> > Now, I
> > understand that if there is another stop pending, PTRACE_INTERRUPT
> > will simply piggy-backs off that one, but I don't believe that is
> > happening in this case.
>
> Yes, thanks. This is wrong. We need to remove SIGTTIN from jobctl.
> The problem, I am not sure when... I'll try to think.

Probably not.

Let me try to clarify. It reports SIGTTIN because your test-case doesn't
send SIGCONT to the tracee before PTRACE_INTERRUPT, so the child is still
stopped-but-running.

IOW, after the tracer acks SIGSTOP

err = ptrace(PTRACE_CONT, child, NULL, (void*)SIGSTOP);

the child actually starts the group-stop and becomes "STOPPED", so that
if you do ptrace_detach() after that it will stop again in TASK_STOPPED
state even if it is no longer traced.

Then SIGTTIN changes the signal number in "jobctl & JOBCTL_STOP_SIGMASK",
the tracee is still "STOPPED" but now it looks as if it was stopped by
SIGTTIN, not by SIGSTOP. Hmm, not sure this is really good, but we can't
use signal->group_exit_code.
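
To illustrate the overwrite, here is a rough user-space model. It is not
the kernel code: the struct, the helper names, and the MODEL_* constants
are all made up for this sketch, and the mask width and flag bit are
assumptions. It only shows that recording a second stop signal replaces
the signal number while the "stopped" state itself stays as it was.

```c
#include <signal.h>

/* Simplified user-space model, NOT kernel code. All names are hypothetical. */
#define MODEL_STOP_SIGMASK 0xffffUL      /* assumed width of the signr field */
#define MODEL_STOP_PENDING (1UL << 17)   /* assumed "stop pending" flag bit */

struct model_task {
    unsigned long jobctl;
};

/* Record a stop signal: clear the old signal number, then store the new one. */
static void model_set_stop(struct model_task *t, int signr)
{
    t->jobctl &= ~MODEL_STOP_SIGMASK;
    t->jobctl |= ((unsigned long)signr & MODEL_STOP_SIGMASK) | MODEL_STOP_PENDING;
}

/* Read back the signal number that a later stop report would carry. */
static int model_stop_signr(const struct model_task *t)
{
    return (int)(t->jobctl & MODEL_STOP_SIGMASK);
}

/* Replay the sequence from the test-case: SIGSTOP first, then SIGTTIN. */
static int model_replay(void)
{
    struct model_task t = { 0 };

    model_set_stop(&t, SIGSTOP);  /* tracer injected SIGSTOP */
    model_set_stop(&t, SIGTTIN);  /* SIGTTIN arrives while still "STOPPED" */
    return model_stop_signr(&t);
}
```

In this model model_replay() returns SIGTTIN, i.e. the earlier SIGSTOP is
no longer visible in the recorded signal number.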

Finally, after PTRACE_INTERRUPT the tracee "returns" to the "STOPPED"
state and reports SIGTTIN, and I agree this looks confusing... But I'm
not sure we should/can change this; this behaviour probably makes sense
too.
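
On the tracer side, a sketch of how one could tell the two cases apart
from the waitpid() status. The helper names are hypothetical; the only
things taken from the man page are that a PTRACE_SEIZE'd tracee reports
these stops with status>>16 == PTRACE_EVENT_STOP, and that WSTOPSIG is
SIGTRAP for a "pure" PTRACE_INTERRUPT stop.

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* True if a waitpid() status is a PTRACE_EVENT_STOP report. */
static bool is_event_stop(int status)
{
    return WIFSTOPPED(status) && (status >> 16) == PTRACE_EVENT_STOP;
}

/* Signal number carried by a PTRACE_EVENT_STOP report, or 0 otherwise. */
static int event_stop_signal(int status)
{
    return is_event_stop(status) ? WSTOPSIG(status) : 0;
}

/*
 * True if the stop seen after PTRACE_INTERRUPT carries a signal other
 * than SIGTRAP, i.e. it reports the signal number of a group stop.
 */
static bool interrupt_reports_group_stop(int status)
{
    return is_event_stop(status) && WSTOPSIG(status) != SIGTRAP;
}
```

With the test-case from this thread, interrupt_reports_group_stop() would
be true for the last status, since WSTOPSIG(status) is SIGTTIN there.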

Damn. I'll try to think more, but I simply can't decide what we
actually want in this case.

Oleg.