Re: Some questions about linux kernel.

From: Jason Gunthorpe (jgg@ualberta.ca)
Date: Tue Mar 14 2000 - 01:33:12 EST


On Mon, 13 Mar 2000, James Sutherland wrote:

> I want to emphasise this point: the OOM handler should **NEVER** be
> invoked. If it is ever needed, there is something BADLY wrong with the
> system - either we do something drastic (process slaughtering) or the
> machine goes down completely. Worrying about which user processes are
> killed to avoid an uncontrolled failure is a case of rearranging the
> deckchairs on the Titanic.

Just as someone who has been bitten by this, I agree with James. The
number one priority is keeping the machine up - I don't care what you
kill, just keep it running and preferably allow remote logins still...

I'd think there are only two ways you can accidently OOM a box - a single
process goes nutz or something like a fork bomb. I'd think the simplest
solution is to look for the process using the most memory or the user with
the most processes+memory (some weird weight factor here) and blow it
away. If this means jonhnie looses their shell, their emacs and whatever
else because someone fork bombed a box then too bad. If this means apache
gets blown away because a CGI went insane then too bad. When the
alternative is having to drive to the site and hit the reset button (which
had to be done, twice :P) it is a minor trouble.

In my case it turned out a WML process got into an infinite ram using
loop, on 2.2.6-ac1 the machine just crashed - no kernel log messages,
nothing, just dead. Unnaceptable :> In this particular case the wml
process got up to some 400meg of ram usage before things got really out of
hand. Lots of swap! These days there is a ulimit protecting these wml
processes, but it was really hit and miss to find out that it was indeed
wml that was causing this.. (Heck, it was a lucky guess it was even OOM)

IMHO a completely seperate topic is how to prevent a malicious user from
abusing this to nuke a machine.. That sounds like a very difficult prolem
indeed.

Jason

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Mar 15 2000 - 21:00:27 EST