Re: [PATCH] mm: limit direct reclaim for higher order allocations

From: Joonsoo Kim
Date: Wed Feb 24 2016 - 23:43:00 EST


On Wed, Feb 24, 2016 at 09:47:27PM -0500, Rik van Riel wrote:
> On Thu, 2016-02-25 at 09:30 +0900, Joonsoo Kim wrote:
> > On Wed, Feb 24, 2016 at 05:17:56PM -0500, Rik van Riel wrote:
> > > On Wed, 2016-02-24 at 14:15 -0800, David Rientjes wrote:
> > > > On Wed, 24 Feb 2016, Rik van Riel wrote:
> > > >
> > > > > For multi page allocations smaller than
> > > > > PAGE_ALLOC_COSTLY_ORDER,
> > > > > the kernel will do direct reclaim if compaction failed for any
> > > > > reason. This worked fine when Linux systems had 128MB RAM, but
> > > > > on my 24GB system I frequently see higher order allocations
> > > > > free up over 3GB of memory, pushing all kinds of things into
> > > > > swap, and slowing down applications.
> > > > >  
> > > >
> > > > Just curious, are these higher order allocations typically done
> > > > by the slub allocator or where are they coming from?
> > >
> > > These are slab allocator ones, indeed.
> > >
> > > The allocations seem to be order 2 and 3, mostly
> > > on behalf of the inode cache and alloc_skb.
> >
> > Hello, Rik.
> >
> > Could you tell me the kernel version you tested?
> >
> > Commit 45eb00cd3a03 (mm/slub: don't wait for high-order page
> > allocation) changes slub allocator's behaviour that high order
> > allocation request by slub doesn't cause direct reclaim.
>
> The system I observed the problem on has a
> 4.2 based kernel on it. That would explain.
>
> Are we sure the problem is limited just to
> slub, though?

AFAIK, there is one more notable place that requests high-order pages:
the allocation for thread_info. However, it is much less aggressive
than the slub one. Please refer to the THREAD_SIZE_ORDER definition.
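
To make the order concrete, here is a small user-space sketch of how
THREAD_SIZE follows from THREAD_SIZE_ORDER. The values are assumptions
for illustration (x86_64 with 4 KiB pages and no KASAN); other
architectures and configs differ. pages_for_order() is a made-up
helper, not a kernel function.

```c
#include <stddef.h>

/* Assumed values for illustration: x86_64, 4 KiB pages, no KASAN. */
#define PAGE_SHIFT        12
#define PAGE_SIZE         ((size_t)1 << PAGE_SHIFT)
#define THREAD_SIZE_ORDER 2
#define THREAD_SIZE       (PAGE_SIZE << THREAD_SIZE_ORDER)

/* Hypothetical helper: contiguous pages needed by an order-N request. */
static inline size_t pages_for_order(unsigned int order)
{
	return (size_t)1 << order;
}
```

With these assumed values, each thread stack is an order-2 request,
i.e. 4 contiguous pages (16 KiB).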

If we need to fix this situation, I think it is better to make
shrink_zone_memcg() consider the requested allocation order. Entering
direct reclaim means that async compaction has already failed for this
low order. Sync compaction is much more powerful than the async one,
but it is still possible that compaction would not work well at that
time. Because this low-order allocation is something we care about,
unlike a PAGE_ALLOC_COSTLY_ORDER allocation, I think that a
small amount of reclaim is better than just skipping it.
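
A rough user-space sketch of what "consider the order" could look
like is below. This is not a patch and not the existing
shrink_zone_memcg() logic. The helper names are made up, and the
policy (free at least one SWAP_CLUSTER_MAX batch, or 2 << order pages
if that is larger, then stop) is only an assumption to show a bounded,
order-dependent reclaim target.

```c
#include <stdbool.h>

#define SWAP_CLUSTER_MAX 32UL	/* reclaim batch size, as in the kernel */

/*
 * Hypothetical helper: how many pages one direct reclaim attempt
 * should try to free for a request of the given order.
 */
static unsigned long order_aware_reclaim_target(unsigned int order)
{
	/* Assumed policy: leave compaction about twice the request size. */
	unsigned long gap = 2UL << order;

	return gap < SWAP_CLUSTER_MAX ? SWAP_CLUSTER_MAX : gap;
}

/* Hypothetical stop condition for the reclaim loop. */
static bool reclaimed_enough(unsigned int order, unsigned long nr_reclaimed)
{
	return nr_reclaimed >= order_aware_reclaim_target(order);
}
```

Under this assumed policy, an order-2 or order-3 request stops after
32 reclaimed pages, so we still reclaim a little instead of skipping
reclaim entirely, but nowhere near gigabytes.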

Thanks.