Time window between process dying and its proc directory going away

From: Pengcheng
Date: Fri Apr 08 2011 - 23:05:29 EST

Dear folks,

I'm experimenting with a 2.6.3x kernel on some network test boxes.
Under very heavy network traffic, the proc directory of a process
*seems* not to go away for a short time window after the process dies.

In our system, a daemon launches many processes and a watchdog
monitors the heartbeat of each one. If a process stops sending
heartbeats for a while, the watchdog asks the daemon to kill the
process and restart it.

To kill a process, we send it a SIGQUIT so it can exit gracefully. If
it is still around after 10 seconds, we send a SIGKILL. We call
waitpid(-1, WNOHANG) before trying to start a new process.
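
For reference, here is a minimal sketch of that kill sequence. It is
written in Python rather than taken from our daemon, and the function
names, the polling interval and the reuse of the grace period after
SIGKILL are made up for illustration.

```python
import os
import signal
import time


def reap_until(pid, deadline, poll_interval=0.05):
    """Poll waitpid(-1, WNOHANG) until `pid` is reaped or `deadline` passes.

    Returns the wait status of `pid`, or None on timeout. Other children
    that happen to be reaped along the way are discarded in this sketch.
    """
    while time.monotonic() < deadline:
        reaped, status = os.waitpid(-1, os.WNOHANG)
        if reaped == pid:
            return status
        if reaped == 0:
            # No child has changed state yet; back off briefly.
            time.sleep(poll_interval)
    return None


def terminate_and_reap(pid, grace_seconds=10.0):
    """Send SIGQUIT, wait up to grace_seconds, then fall back to SIGKILL."""
    os.kill(pid, signal.SIGQUIT)
    status = reap_until(pid, time.monotonic() + grace_seconds)
    if status is None:
        # Still around after the grace period: force it.
        os.kill(pid, signal.SIGKILL)
        status = reap_until(pid, time.monotonic() + grace_seconds)
    return status
```

Note that waitpid() with WNOHANG returns 0 when no child has changed
state yet, so a single call does not by itself confirm that the killed
process was reaped. The sketch therefore polls until the target pid is
returned.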

We found a problem in our testing: after a process was killed, its
/proc/<pid> directory (*we suspect*) might still be there for a while
(at least 4 seconds) and we could still read files in it.
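
One way to observe such a lingering entry is sketched below, again in
Python with hypothetical helper names: read the one-letter state field
from /proc/<pid>/stat and measure how long the directory stays visible.
A state of "Z" would mean the process has exited but has not been
reaped yet.

```python
import os
import time


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/stat, or None if gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            data = f.read()
    except (FileNotFoundError, ProcessLookupError):
        return None
    # The format is "pid (comm) state ..."; comm may itself contain
    # spaces or ')', so split on the last closing parenthesis.
    return data.rsplit(")", 1)[1].split()[0]


def linger_time(pid, timeout=10.0, poll_interval=0.01):
    """Seconds until /proc/<pid> disappears, or None if still there at timeout."""
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        if not os.path.exists(f"/proc/{pid}"):
            return time.monotonic() - start
        time.sleep(poll_interval)
    return None
```

Logging proc_state(pid) alongside linger_time(pid) right after the
kill would show whether the directory we can still read belongs to a
process that has not been waited for.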

Any thoughts on how to reproduce this problem easily (without heavy
network traffic)?

Thanks in advance!
