Re: OOM problems on 2.6.12-rc1 with many fsx tests

From: Andrew Morton
Date: Wed Mar 23 2005 - 21:02:48 EST


Andrea Arcangeli <andrea@xxxxxxx> wrote:
>
> On Wed, Mar 23, 2005 at 03:42:32PM -0800, Andrew Morton wrote:
> > I'm suspecting here that we simply leaked a refcount on every darn
> > pagecache page in the machine. Note how mapped memory has shrunk down to
> > less than a megabyte and everything which can be swapped out has been
> > swapped out.
> >
> > If so, then oom-killing everything in the world is pretty inevitable.
>
> Agreed, it looks like a memleak of a page_count (while mapcount is fine).
>
> I would suggest looking after pages part of pagecache (i.e.
> page->mapcount not null) that have a mapcount of 0 and a page_count > 1,
> almost all of them should be like that during the memleak, and almost
> none should be like that before the memleak.
>
> This seems unrelated to the bug that started the thread that was clearly
> a slab shrinking issue and not a pagecache memleak.
>

The vmscan.c changes in -rc1 look harmless enough. That's assuming that
2.6.11 doesn't have the bug.

btw, that new orphanned-page handling code has a printk in it, and nobody
has reported it coming out yet...

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/