oom with 2.6.11

From: Christian Kujau
Date: Tue Mar 08 2005 - 10:23:12 EST


hallo list,

today my machine went out out memory and noticing it several hours after
the first OOM message in the log, i wonder
1) why this happened at all and
2) why almost every service was killed despite the clever algorithms
documented in mm/oom_kill.c.

the first oom message went to the syslog at 01:27, i was away and no heavy
tasks were scheduled:

http://nerdbynature.de/bits/sheep/2.6.11/oom/oom_2.6.11.txt

mysqld got killed by the oom killer, so i have to suspect mysql for being
the reason for oom here, even that i know that mysqld is running all day
long. several other tasks got killed, but "Free swap" stays at 0kB and the
oom killer kills almost every other tasks, with no success in freeing ram.

the log stops at 03:21, perhaps syslog-ng got killed.
at around 07:31 i noticed the mess, did SYSRQ-E and now i was able to
login again. i pressed SYSRQ-M/T/P too, they are all in the log. at this
time loadavg was at 249 ;)

i went to runlevel 2, then up again to 3 and all services are up and
running again.

some 2.6.11-rc3 BK snapshot was running pretty stable (no OOM) for ~30
days before i switched to 2.6.11 (vanilla) a few days ago. i have to (not)
reproduce the problem the next night, i wonder if it will happen again.

do you vm-gurus have any idea to the points asked above?

more infos about the box here: http://nerdbynature.de/bits/sheep/2.6.11/oom/


thank you for your comments,
Christian.
--
BOFH excuse #281:

The co-locator cannot verify the frame-relay gateway to the ISDN server.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/