Re: Possible ways of dealing with OOM conditions.

From: Evgeniy Polyakov
Date: Sat Jan 20 2007 - 20:51:14 EST


On Sat, Jan 20, 2007 at 05:36:03PM -0500, Rik van Riel (riel@xxxxxxxxxxx) wrote:
> Evgeniy Polyakov wrote:
> >On Fri, Jan 19, 2007 at 01:53:15PM +0100, Peter Zijlstra
> >(a.p.zijlstra@xxxxxxxxx) wrote:
>
> >>>Even further development of such idea is to prevent such OOM condition
> >>>at all - by starting swapping early (but wisely) and reduce memory
> >>>usage.
> >>These just postpone execution but will not avoid it.
> >
> >No. If system allows to have such a condition, then
> >something is broken. It must be prevented, instead of creating special
> >hacks to recover from it.
>
> Evgeniy, you may want to learn something about the VM before
> stating that reality should not occur.

So I should start believing that OOM cannot be prevented, bugs cannot
be fixed and things cannot be changed, just because that is how it works
right now? That is why I'm not subscribed to lkml :)

> Due to the way everything in the kernel works, you cannot
> prevent the memory allocator from allocating everything and
> running out, except maybe by setting aside reserves to deal
> with special subsystems.
>
> As for your "swapping early and reduce memory usage", that is
> just not possible in a system where a memory writeout may need
> one or more memory allocations to succeed and other I/O paths
> (eg. file writes) can take memory from the same pools.

If a system starts swapping only when it cannot allocate a new page,
then it is a broken system. I bet you put on warm clothing well before
your hands are frostbitten, and you do not carry a liter of alcohol in
your pocket for such an emergency. And to get warm clothing you still
need to cross the cold street to the shop, but you will do it before the
weather becomes arctic.

> With something like iscsi it may be _necessary_ for file writes
> and swap to take memory from the same pools, because they can
> share the same block device.

Of course swapping can require additional allocations; when it happens
over the network that is quite obvious.

The main problem is that if a system has been put into a state where
its life depends on the last possible allocation, then it is broken.

There is a light connected to a car's fuel tank which starts blinking
when the amount of fuel drops below a predefined level. The car does not
just stop suddenly and start to draw fuel from the reserve (well,
eventually it stops, but it reports the problem long before it dies).
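
To make the fuel-light idea concrete, here is a minimal sketch of a
two-threshold check on a page pool. All names (struct pool, low_mark,
min_mark, pool_check, pool_alloc) are made up for illustration and are
not the kernel's actual interfaces; the thresholds are assumptions too.
The only point is that the warning level sits well above the level where
ordinary users are refused.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page pool; not the kernel's real zone/watermark code. */
struct pool {
	size_t free_pages;
	size_t low_mark;	/* "fuel light": start reclaim at or below this */
	size_t min_mark;	/* reserve: only emergency users at or below this */
};

enum pool_state { POOL_OK, POOL_RECLAIM, POOL_EMERGENCY };

/* Classify the pool by comparing free pages against both thresholds. */
static enum pool_state pool_check(const struct pool *p)
{
	assert(p->min_mark <= p->low_mark);

	if (p->free_pages <= p->min_mark)
		return POOL_EMERGENCY;
	if (p->free_pages <= p->low_mark)
		return POOL_RECLAIM;
	return POOL_OK;
}

/*
 * Take one page from the pool. Ordinary users are refused once the pool
 * is down to its reserve; emergency users may drain it to zero.
 * Returns true if a page was handed out.
 */
static bool pool_alloc(struct pool *p, bool emergency_user)
{
	enum pool_state state = pool_check(p);

	if (p->free_pages == 0)
		return false;
	if (state == POOL_EMERGENCY && !emergency_user)
		return false;

	p->free_pages--;
	return true;
}
```

A caller would start background writeout as soon as pool_check() returns
POOL_RECLAIM, while allocations still succeed, instead of waiting for the
first failed allocation.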

> Please get out of your fantasy world and accept the constraints
> the VM has to operate under. Maybe then you and Peter can agree
> on something.

I cannot accept a situation where the problem is not fixed and a
recovery path is added instead. There must be both ways of dealing with
it: emergency force majeure recovery and preventive steps.

What we are talking about (apart from pointing out obvious things and
sending each other back to school), at least as I see it, is ways of
dealing with a possible OOM condition. If OOM has happened, then there
must be a recovery path, but OOM must also be prevented, and ways to do
this were described too.

> --
> Politics is the struggle between those who want to make their country
> the best in the world, and those who believe it already is. Each group
> calls the other unpatriotic.

--
Evgeniy Polyakov