Re: A peculiarity in ptrace/waitpid behavior
From: Oleg Nesterov
Date: Fri Mar 20 2015 - 12:27:52 EST
Hi Pavel,
let me add lkml, we should not discuss this offlist.
On 03/20, Pavel Labath wrote:
>
> 1) we get a waitpid() notification that the tracee got SIGUSR1
> 2) we do a ptrace(GETSIGINFO) to get more info
> 3) eventually we decide to restart the tracee with PTRACE_CONT, passing it
> SIGUSR1
> 4) immediately after that we get another waitpid notification, again with
> SIGUSR1, even though the thread had received no additional signals
> 5) we again try to do a GETSIGINFO, however this time it fails with ESRCH.
> Therefore, we assume that the thread has died
I found a similar bug by code inspection some time ago. I even have
a fix, but I need to think more... And I even wrote the test-case ;)
see below.
But so far I can't say if you hit the same problem or not. If you can
reproduce the problem, perhaps I can send you a debugging patch?
Oleg.
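#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <assert.h>

/* send a signal to one specific thread */
#define tkill(pid, sig) \
	syscall(__NR_tkill, pid, sig)

void run_test(void)
{
	int pid, stat;

	pid = fork();
	if (!pid) {
		/* child: become a tracee and stop, it must never resume */
		assert(ptrace(PTRACE_TRACEME, 0,0,0) == 0);
		raise(SIGSTOP);
		assert(0);
	}

	/* 0x137f: the tracee is stopped by SIGSTOP */
	assert(pid == wait(&stat) && stat == 0x137f);
	tkill(pid, SIGTRAP);	/* should not be reported */
	tkill(pid, SIGKILL);

	/* the only expected report is the death by SIGKILL */
	assert(pid == wait(&stat));
	if (stat == 0x9)
		return;

	printf("unexpected wait: stat=%x\n", stat);
	/* stop every process in our process group */
	kill(0, SIGKILL);
}

int main(void)
{
	int i = 8;	/* random */

	/* run several copies in parallel to make the race more likely */
	while (--i)
		if (!fork())
			break;

	for (;;)
		run_test();

	return 0;
}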