Re: Linux killed Kenny, bastard!

From: Evgeniy Polyakov
Date: Tue Jan 13 2009 - 03:52:58 EST


On Mon, Jan 12, 2009 at 05:53:47PM -0800, David Rientjes (rientjes@xxxxxxxxxx) wrote:
> On Tue, 13 Jan 2009, Evgeniy Polyakov wrote:
>
> > Like anything that spawns a thread or process per request/client, or
> > preallocates set of them which connect to the huge object like database.
> > Most of the time database/server is killed first instead of comparably
> > small clients.
>
> No, the reverse is true: when a task is chosen for oom kill based on the
> badness heuristic, the oom killer first attempts to kill any child task
> that isn't attached to the same mm. If the child shares an mm, both tasks
> must die before memory freeing can occur.

It is a theory, not a practice. OOM-killer most of time starts from ssh,
database and lighttpd on the tested machines, when it could start in
the reverse order and do not touch ssh at all. Better not from daemon
itself, but its fastcgi spawned processes.

> > In some cases it is possible to tune the environment, in
> > others it is not that simple. This patch works for such situatons
> > perfectly and does not require additional administrative burden, since
> > it does not make thinge worse as a whole, but only better for the very
> > commonly used cases, that's why I propose it for inclusion.
> >
>
> It's an inappropriate addition since /proc/pid/oom_adj scores exist which
> can prefer or protect certain tasks over others when the oom killer
> chooses a target, including oom kill immunity. These scores are inherited
> from parent tasks and can be tuned after the fork to your oom kill target
> preference.

I agree, that there are ways to tune the way oom-killer selects the
victim, and likely after hours of games this subtly will work for the
specified workload. What I propose is the simplest way for the most
commonly used case. It is a help for the admin and not the force to
invent complex machinery which will be error-prone and hard to debug
when eventually oom happens. This will work, but it is way more complex
than what I propose, without immediately visible net effects on other
parts of the originally balanced system.

--
Evgeniy Polyakov
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/