Re: Misleading OOM messages

From: Christoph Lameter
Date: Fri May 22 2009 - 10:59:09 EST


On Tue, 19 May 2009, Pavel Machek wrote:

> > Well that is of course not enough memory.
>
> Ok, so in the end, there are two reasons for OOM:
>
> 1) Out of virtual memory.
>
> there's simply not enough ram+swap to fit the data. You go OOM.
> This seems to be common on small machines. 8M is pushing it, but
> 64M RAM + 64M swap + today's GNOME would probably do that.
>
> And maybe the way to hint people would be printing 'out of
> _virtual_ memory'.

This is only an issue for anonymous pages and is therefore load dependent.
More memory can be provided by adding swap space.


> 2) Something goes very wrong with reclaim
>
> this seems to be common on very big machines you have experience
> with.
>
> Perhaps 1 and 2 can be told apart by zero swap free in the 1) case?

We could add a warning in get_swap_page() to cover the "out of virtual
memory" case. It would be triggered only once, when we first run out of
swap space.
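
A minimal userspace sketch of the warn-once idea, not the actual
get_swap_page() code. The names get_swap_slot(), nr_free_swap_slots,
swap_exhausted_warned and warnings_emitted are hypothetical and exist
only for illustration; in the kernel a helper such as printk_once()
could play the role of the flag.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the number of free swap slots. */
static long nr_free_swap_slots = 2;

/* Set when the warning has been printed, so later failures stay quiet. */
static bool swap_exhausted_warned = false;

/* Counts emitted warnings; only here so the behaviour can be checked. */
static int warnings_emitted = 0;

/*
 * Simplified model of a swap slot allocator (assumption, not kernel code).
 * Returns true and consumes a slot on success, false when none is left.
 * The first failure prints a warning; subsequent failures are silent.
 */
static bool get_swap_slot(void)
{
    if (nr_free_swap_slots > 0) {
        nr_free_swap_slots--;
        return true;
    }

    if (!swap_exhausted_warned) {
        swap_exhausted_warned = true;
        warnings_emitted++;
        fprintf(stderr, "warning: out of swap space\n");
    }
    return false;
}
```

With two free slots, the first two calls succeed, the third fails and
prints the warning, and any further failing call prints nothing.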

> And perhaps you can invent some better message for 2) case?

The something-goes-wrong-with-reclaim case occurs for a variety of reasons
on other machines, even on the small machines that I currently work with.
I am not in the embedded space right now, so I likely do not see the
out of swap -> OOM condition. The out of memory issues that I see are
misconfigurations on a variety of levels. At the top of the list right now
is running out of memory on 32 bit machines because someone put too much
memory into them, so that ZONE_NORMAL gets exhausted.

Then they add more memory, and the OOM occurs even sooner, which leaves
them somewhat confused.
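
One likely contributor to this effect can be sketched with some rough
arithmetic: each physical page needs a descriptor that has to live in
low memory, so more RAM leaves less of ZONE_NORMAL for everything else.
The constants below are assumptions for illustration (4 KiB pages, a
32 byte page descriptor, 896 MiB of low memory, RAM larger than low
memory) and the function names are hypothetical; real values depend on
the kernel configuration.

```c
#include <stdint.h>

/* Assumed values for a typical 32 bit x86 setup; not taken from the mail. */
#define PAGE_SIZE_BYTES    4096ULL          /* 4 KiB pages */
#define STRUCT_PAGE_BYTES  32ULL            /* per-page descriptor size */
#define LOWMEM_BYTES       (896ULL << 20)   /* directly mapped low memory */

/* Low memory consumed by the page descriptors for ram_bytes of RAM. */
static uint64_t mem_map_bytes(uint64_t ram_bytes)
{
    return (ram_bytes / PAGE_SIZE_BYTES) * STRUCT_PAGE_BYTES;
}

/* Low memory left over after the page descriptors, clamped at zero. */
static uint64_t lowmem_left_bytes(uint64_t ram_bytes)
{
    uint64_t used = mem_map_bytes(ram_bytes);

    return used >= LOWMEM_BYTES ? 0 : LOWMEM_BYTES - used;
}
```

Under these assumptions, 16 GiB of RAM costs 128 MiB of low memory for
descriptors and 64 GiB costs 512 MiB, so the larger machine starts out
with less usable ZONE_NORMAL than the smaller one.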
