Re: waitpid and strace -f

From: Mike Coleman (mkc@kc.net)
Date: Mon Jan 31 2000 - 19:31:36 EST


Andreas Schwab <schwab@suse.de> writes:
> Mike Coleman <mkc@kc.net> writes:
>
> |> "Alexander V. Lukyanov" <lav@long.yar.ru> writes:
> |> > The following program receives ECHILD for waitpid when it is run with
> |> > `strace -f'. WNOHANG and `-f' are required. I'm not sure if it is
> |> > kernel or strace problem.
> |>
> |> AFAIK, this is correct behavior. As soon as the child runs, it exits, so
> |> depending on the order that the parent and child run in, the child may already
> |> be gone by the time the waitpid takes place. Thus the ECHILD.
>
> This is wrong. The child will not be gone (but kept around as zombie) as
> long as nobody has waited for it. So the waitpid must never return ECHILD
> for the first call (assuming the pid is valid).

You're right--I goofed. Allow me to repent by giving the *real* answer (now
that my credibility is shot :-).

When you use 'strace -f' to follow these two processes, strace attaches itself
(with ptrace) to the child, which changes its parent (for the purposes of
wait) to be the strace process itself. Thus the wait call returns ECHILD,
since the child in question has been "kidnapped" by strace.

There are difficulties with trying to implement the correct wait4 behavior
under ptrace. strace goes to some effort to do the right thing, but this case
isn't covered.

More to follow,
--Mike

-- 
Any sufficiently adverse technology is indistinguishable from Microsoft.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Mon Feb 07 2000 - 21:00:05 EST