Re: Linux 2.4 VS 2.6 fork VS thread creation time test

From: David Lang
Date: Sun May 23 2004 - 05:02:22 EST


On Sun, 23 May 2004, Christian Borntraeger wrote:

> Date: Sun, 23 May 2004 11:39:41 +0200
> From: Christian Borntraeger <linux-kernel@xxxxxxxxxxxxxxx>
> To: linux-kernel@xxxxxxxxxxxxxxx
> Cc: Gergely Czuczy <phoemix@xxxxxxxxxxx>, itk-sysadm@xxxxxxx
> Subject: Re: Linux 2.4 VS 2.6 fork VS thread creation time test
>
> Gergely Czuczy wrote:
> > failed. As I told it above all the processes are teminated right after
> > creation, but there were a lot of defunct processes in the system, and
> > they were only gone when the parent termineted.
>
> Have you heard of wait, waitpid and pthread_join?

there really is some sort of problem with 2.6.6 in this area. I have an
app that I am trying to stress test on a dual opteron system and under a
heavy load something goes haywire and the children become zombies. on a
dual athlon the test manages 2500 forks/sec and can continue forever (Ok,
I only tested it to 11M forks at full speed :-), but the dual opteron box
manages 3500 connections/sec for a few thousand connections and then stops
reaping the children. if I attach strace to the parent at this point the
logjam is broken and strace shows the wait calls receiving and handleing
the sigchild

the prarent deals with sigchild by
handler{
while ( wait(...) >0);
signal(SIGCHLD, handler);
}

unfortunantly trying to leave strace attached while running the test slows
it down to ~1200 forks/sec and the problem never forms.

David Lang

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/