Re: Simple script that locks up my box with recent kernels

From: Jesper Juhl
Date: Thu Nov 23 2006 - 19:51:07 EST


On 22/11/06, Linus Torvalds <torvalds@xxxxxxxx> wrote:


> On Wed, 22 Nov 2006, Jesper Juhl wrote:
> >
> > So it *seems* to be somehow related to running low on RAM and swap
> > starting to be used.
>
> Does it happen if you just do some simple "use all memory" script, eg run
> a few copies of
>
>	#define SIZE (100<<20)
>
>	char *buf = malloc(SIZE);
>	memset(buf, 0, SIZE);
>	sleep(100);
>
> on your box?


No. That doesn't kill the box. It very effectively turns it into a
slug (bigtime) but it doesn't kill it.

Running just a single copy is no problem. Neither is running 4 or 5 in
parallel.

Doing

	for i in $(seq 1 30); do ./a.out & done

turns the box into a slug for 5 minutes or so, but then when all the
processes have terminated and another few minutes have passed it is
back to normal.

Running

	for i in $(seq 1 100); do ./a.out & done

is a different story though. Starting the first ~40 processes happens
relatively fast, then starting the next 10-20 or so happens very
slowly (5-10 sec intervals between each one), then it starts taking
something like 20-30 seconds for each new process to start and when we
get somewhere around 75-85 processes started the box appears to be
hung, except that sysrq still works and I can still switch ttys with
ctrl+alt+F?. After a few minutes in this almost-hung state the OOM
killer kicks in and kills a few of the processes and after some
additional minutes all 100 processes eventually get started and
sometimes a few have even started to die off as well. Once all 100
processes have been started it takes somewhere around 5-10 minutes for
them all to terminate (most terminate normally, some die with
"segmentation fault" and they die off roughly in the order they got
started). The biggest problem after all processes have terminated is
then that the box remains a slug. I left it alone for ~10 minutes at
this point and when I came back it was still not back to normal (and
trying to do a normal reboot took so long that I eventually lost my
patience and used sysrq+b to reboot it).
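
For anyone who wants to repeat the test, the snippet quoted above can be
fleshed out roughly as follows. This is only a sketch, not necessarily
what was compiled here: the function name touch_memory is made up, and
the fill byte is 1 rather than 0 on the assumption that an optimising
compiler may otherwise fold malloc() + memset(0) into a calloc() that
never touches the pages. Note that memset() takes (pointer, fill byte,
length), in that order.

```c
#include <stdlib.h>
#include <string.h>

#define SIZE ((size_t)100 << 20)	/* 100 MiB, as in the quoted snippet */

/*
 * Allocate `size` bytes and write to every one of them, so the kernel
 * has to back the whole range with real pages instead of just reserving
 * address space. Returns the buffer, or NULL if malloc() fails.
 *
 * touch_memory is a hypothetical name used only for this sketch.
 */
char *touch_memory(size_t size)
{
	char *buf = malloc(size);

	if (buf == NULL)
		return NULL;

	/* Argument order is (pointer, fill byte, length). */
	memset(buf, 1, size);
	return buf;
}
```

Calling touch_memory(SIZE) from main() and then sleep(100) (from
<unistd.h>) should give roughly the a.out behaviour described above.
Checking the return value also separates a failed malloc() from a kill
by the OOM killer.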


--
Jesper Juhl <jesper.juhl@xxxxxxxxx>
Don't top-post http://www.catb.org/~esr/jargon/html/T/top-post.html
Plain text mails only, please http://www.expita.com/nomime.html