Re: [PATCH 3/3] mm: page allocator: Drain per-cpu lists after direct reclaim allocation fails

From: Wu Fengguang
Date: Tue Sep 07 2010 - 22:13:52 EST


On Tue, Sep 07, 2010 at 10:23:48PM +0800, Christoph Lameter wrote:
> On Mon, 6 Sep 2010, Dave Chinner wrote:
>
> > [ 596.628086] [<ffffffff81108a8c>] ? drain_all_pages+0x1c/0x20
> > [ 596.628086] [<ffffffff81108fad>] ? __alloc_pages_nodemask+0x42d/0x700
> > [ 596.628086] [<ffffffff8113d0f2>] ? kmem_getpages+0x62/0x160
> > [ 596.628086] [<ffffffff8113dce6>] ? fallback_alloc+0x196/0x240
>
> fallback_alloc() showing up here means that one page allocator call from
> SLAB has already failed.

That may be due to the GFP_THISNODE flag, which includes __GFP_NORETRY.
The allocation may fail simply because there are many concurrent
page-allocating tasks, not necessarily because memory is really short.

The concurrent page-allocating tasks may consume all the pages freed
by try_to_free_pages() inside __alloc_pages_direct_reclaim(), before
the direct reclaim task is able to get its page with
get_page_from_freelist(). should_alloc_retry() then returns 0 for
__GFP_NORETRY, which stops further retries.

In theory, __GFP_NORETRY might fail even without other tasks
concurrently stealing the current task's direct-reclaimed pages. The
pcp lists might happen to be sparsely populated (pcp.count ranging
from 0 to pcp.batch), and try_to_free_pages() might not free enough
pages to fill them up to the pcp.high watermark, hence no pages are
returned to the buddy system and NR_FREE_PAGES is not increased.
zone_watermark_ok() then remains false and the allocation fails.
Mel's patch to increase the accuracy of zone_watermark_ok() should
help this case.

> SLAB then did an expensive search through all
> object caches on all nodes to find some available object. There were no
> objects in queues at all therefore SLAB called the page allocator again
> (kmem_getpages()).
>
> As soon as memory is available (on any node or any cpu, they are all
> empty) SLAB will repopulate its queues(!).

Thanks,
Fengguang