Re: WARNING in get_signal

From: Oleg Nesterov
Date: Tue Oct 06 2020 - 13:05:35 EST


On 10/05, Oleg Nesterov wrote:
>
> On 10/05, Oleg Nesterov wrote:
> >
> > > It looks like this code was introduced in commit 73ddff2bee15 ("job
> > > control: introduce JOBCTL_TRAP_STOP and use it for group stop trap").
> >
> > Yes, but I bet this was broken later, _may be_ by 924de3b8c9410c4.
>
> No, it seems this bug is really old. I'll try to make the fix tomorrow.

I still do not see a good fix. I am crying ;)

For the moment, lets forget about this problem. 924de3b8c9410c4 was wrong
anyway, task_join_group_stop() should be fixed:

- if current is traced, "jobctl & JOBCTL_STOP_PENDING" is not
enough, we need to check SIGNAL_STOP_STOPPED/group_stop_count

- if the new thread is traced, task_join_group_stop() should do
nothing, we should rely on ptrace_init_task()


Now lets return to this bug report. This (incomplete) test-case

void *tf(void *arg)
{
return NULL;
}

int main(void)
{
int pid = fork();
if (!pid) {
setpgrp();
kill(getpid(), SIGTSTP);

pthread_t th;
pthread_create(&th, NULL, tf, NULL);

return 0;
}

waitpid(pid, NULL, WSTOPPED);

ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_TRACECLONE);
waitpid(pid, NULL, 0);

ptrace(PTRACE_CONT, pid, 0,0);
waitpid(pid, NULL, 0);

int status;
int thr = waitpid(-1, &status, 0);
printf("pids: %d %d status: %x\n", pid, thr, status);

return 0;
}

triggers WARN_ON_ONCE(!signr) in do_jobctl_trap() and shows that the
auto-attached sub-thread reports the wrong status.

This patch

--- x/include/linux/ptrace.h
+++ x/include/linux/ptrace.h
@@ -218,7 +218,7 @@ static inline void ptrace_init_task(stru
__ptrace_link(child, current->parent, current->ptracer_cred);

if (child->ptrace & PT_SEIZED)
- task_set_jobctl_pending(child, JOBCTL_TRAP_STOP);
+ task_set_jobctl_pending(child, JOBCTL_TRAP_STOP|SIGTRAP);
else
sigaddset(&child->pending.signal, SIGSTOP);
}

should fix the problem, but it is not enough even if we forget about
task_join_group_stop().

- it is not clear to me if the new thread should join the group stop
after (say) PTRACE_CONT. If yes, it is not clear how can we do this.

- in any case it should stop after ptrace_detach(), but in this case
jobctl & JOBCTL_STOP_SIGMASK == SIGTRAP doesn't look right.

So perhaps we can change the patch above to use
current->jobctl & JOBCTL_STOP_SIGMASK instead of SIGTRAP ?

This too doesn't look good, the 1st ptrace_stop() should probably
always report SIGTRAP...

Oleg.