Re: Local DoS (was: Strange 'zombie' problem both in 2.4 and 2.6)

From: David Lang
Date: Mon Jun 14 2004 - 20:34:38 EST


On Mon, 14 Jun 2004, Marcelo Tosatti wrote:

> On Mon, Jun 14, 2004 at 10:01:53AM -0700, David Lang wrote:
>> I think I may be running into the same (or a similar) issue with a
>> workload that forks heavily (~3500 forks/sec). What can I do to let the
>> system survive this sort of load?
>
> Hi David,
>
> v2.6.7-mm tree contains a fix for this, adding a rlimit for
> pending signals.

I'll have to give this a try.

> Can you describe the problem you are seeing in more detail?

I have a stress test running on a dual Opteron 1.4GHz box: it receives a network connection, forks a new process, does a little bit of network traffic, and then the child exits. When I hammer this I get ~3500 connections/sec (with a significant amount of spare CPU; I'm limited by my load boxes), but after a few seconds (8-10) something happens and the parent stops receiving the SIGCHLD signals. If I attach strace to the parent process, the signals start arriving again and everything works for a little longer before the cycle repeats.
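
For readers unfamiliar with this pattern, here is a minimal sketch of a parent that forks short-lived children and reaps them from a SIGCHLD handler. It is not the test program described above, and it is not claimed to reproduce or fix the reported behavior. The names `reap_children` and `spawn_and_reap` are hypothetical, and the child's immediate exit stands in for the network traffic. The one point it illustrates is that standard signals are not queued per event, so a single SIGCHLD delivery may stand for several exited children and the handler has to loop on a non-blocking `waitpid`.

```python
import os
import signal
import time

reaped = []  # (pid, status) pairs collected by the handler


def reap_children(signum=None, frame=None):
    """Reap every child that has exited so far.

    One SIGCHLD delivery can cover several exits, so keep calling
    waitpid() until it reports that nothing more is ready.
    """
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children exist at all
        if pid == 0:
            return  # children exist, but none has exited yet
        reaped.append((pid, status))


def spawn_and_reap(n, timeout=10.0):
    """Fork n short-lived children; return how many were reaped."""
    reaped.clear()
    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        for _ in range(n):
            if os.fork() == 0:
                os._exit(0)  # child: placeholder for the real work
        # Poll rather than signal.pause() to avoid a lost-wakeup race.
        deadline = time.monotonic() + timeout
        while len(reaped) < n and time.monotonic() < deadline:
            time.sleep(0.01)
        return len(reaped)
    finally:
        signal.signal(signal.SIGCHLD, previous)


if __name__ == "__main__":
    print("children reaped:", spawn_and_reap(200))
```

A handler that called `waitpid` only once per signal would leave zombies behind whenever several children exit close together, which is why the loop is the important part of the sketch.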

If I only hit it with ~10,000 connections and then pause, the box survives indefinitely.

Running the same test on a dual Athlon MP 2200+ I get ~2500 connections/sec and it has no problems. I just compiled a 32-bit kernel for the Opteron and get ~3300 connections/sec (with no idle CPU time), and the box doesn't lock up.

I don't know whether this is because it's just below the threshold of the problem, or because there is a bug in the 64-bit kernel (or both).

I'm currently trying to tweak the 32-bit Opteron kernel to get a smidge more speed out of it, to see whether getting back up to the same rate starts triggering the problem again.

David Lang

--
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan