RE: [RFC v3 1/2] mm, compaction: introduce kcompactd

Date: Tue Aug 11 2015 - 04:51:25 EST


> -----Original Message-----
> From: Vlastimil Babka [mailto:vbabka@xxxxxxx]
> Sent: Monday, August 10, 2015 3:14 PM
> To: PINTU KUMAR; linux-mm@xxxxxxxxx
> Cc: linux-kernel@xxxxxxxxxxxxxxx; Andrew Morton; Hugh Dickins; Andrea
> Arcangeli; Kirill A. Shutemov; Rik van Riel; Mel Gorman; David Rientjes; Joonsoo
> Kim; Pintu Kumar
> Subject: Re: [RFC v3 1/2] mm, compaction: introduce kcompactd
> On 08/09/2015 05:37 PM, PINTU KUMAR wrote:
> >> Waking up of the kcompactd threads is also tied to kswapd activity
> >> and follows these rules:
> >> - we don't want to affect any fastpaths, so wake up kcompactd only from the
> >> slowpath, as it's done for kswapd
> >> - if kswapd is doing reclaim, it's more important than compaction, so don't
> >> invoke kcompactd until kswapd goes to sleep
> >> - the target order used for kswapd is passed to kcompactd
> >>
> >> The kswapd compact/reclaim loop for high-order pages is left alone
> >> for now and precedes kcompactd wakeup, but this might be revisited later.
> >
> > kcompactd will be a really nice thing to have, but I oppose calling it from
> > kswapd.
> > Because, just after kswapd, we already have direct_compact.
> Just to be clear, here you mean that kswapd already does the compact/reclaim
> loop?
No, I mean that in the slowpath, after kswapd, there is already direct compaction/reclaim.

> > So it may end up in doing compaction 2 times.
> The compact/reclaim loop might already do multiple iterations. The point is,
> kswapd will terminate the loop as soon as a single page of the desired order
> becomes available. Kcompactd is meant to go beyond that.
> And having kcompactd run in parallel with kswapd's reclaim looks like nonsense
> to me, so I don't see any other way than to have kswapd wake up kcompactd when
> it's finished.
But if kswapd is disabled, then kcompactd will never be called either, so we end up in the same situation.
Just a thought: how about creating a kworker thread to do the work of kcompactd?
It could be scheduled on demand (based on the current fragmentation level for COSTLY_ORDER) from other subsystems.
Or it could be invoked when direct reclaim fails.
As per my observation, running compaction immediately after reclaim gives more benefit.
How about tracking all higher-order allocations in the kernel to understand who actually needs them?
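
The on-demand idea above could be prototyped from user space first. The following is only a rough sketch under assumptions that are not part of the patch: it parses /proc/buddyinfo, estimates how much of the free memory sits in blocks smaller than the target order, and reports the zones above a threshold. The function names, the 0.8 threshold, and the choice of order 3 (the value of PAGE_ALLOC_COSTLY_ORDER) as the target are illustrative. Writing 1 to /proc/sys/vm/compact_memory needs root and CONFIG_COMPACTION.

```python
# Illustrative user-space sketch only; this is not how the kcompactd patch works.
COSTLY_ORDER = 3        # mirrors PAGE_ALLOC_COSTLY_ORDER in the kernel
THRESHOLD = 0.8         # hypothetical cut-off, to be tuned with real data


def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order]} from buddyinfo text."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Expected shape: "Node 0, zone Normal c0 c1 c2 ..."
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(c) for c in parts[4:]]
    return zones


def unusable_free_fraction(counts, order):
    """Fraction of free pages held in blocks too small for `order`."""
    total = sum(n << o for o, n in enumerate(counts))
    if total == 0:
        return 0.0
    usable = sum(n << o for o, n in enumerate(counts) if o >= order)
    return (total - usable) / total


def zones_needing_compaction(text, order=COSTLY_ORDER, threshold=THRESHOLD):
    """List the (node, zone) pairs whose free memory is mostly below `order`."""
    return sorted(
        key
        for key, counts in parse_buddyinfo(text).items()
        if unusable_free_fraction(counts, order) >= threshold
    )


def trigger_compaction(path="/proc/sys/vm/compact_memory"):
    """Ask the kernel to compact all zones (requires root)."""
    with open(path, "w") as f:
        f.write("1")
```

A periodic job could read /proc/buddyinfo, pass its contents to zones_needing_compaction(), and call trigger_compaction() only when the returned list is not empty. Note that this compacts every zone, so it is much coarser than a per-node kernel thread.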

> > Or is it that, with kcompactd, we don't need direct_compact?
> That will have to be evaluated. It would be nice to not need the compact/reclaim
> loop, but I'm not sure it's always possible. We could move it to kcompactd, but it
> would still mean that no daemon does exclusively just reclaim or just
> compaction.
> > In the embedded world the situation is really worse.
> > As per my experience in the embedded world, compaction alone does not always
> > help in the long run.
> >
> > As far as I know, there are already some Android models on the market that
> > run background compaction (from user space).
> > But there are still sluggishness issues due to bad memory state in the long run.
> It should still be better with background compaction than without it. Of course,
> avoiding permanent fragmentation completely is not possible to guarantee, as it
> depends on the allocation patterns.
> > In the embedded world, the major problems are related to camera and browser
> > use cases that require almost order-8 allocations.
> > Also, for low RAM configurations (less than 512M, 256M etc.), the rate of
> > compaction failure is much higher than the rate of success.
> I was under impression that CMA was introduced to deal with such high-order
> requirements in the embedded world?
CMA has its own limitations and drawbacks (because of the movable pages criteria).
Please check this:
So, for low RAM devices we try to keep the CMA area as small as possible.
For IOMMU-supported devices (camera etc.), we don't need CMA.
In the Android case, they use the ION system heap, which relies on higher-order allocations (with a fallback mechanism) and then performs scatter/gather.
For more information, please check this:

> > How can we guarantee that kcompactd is suitable for all situations?
> We can't :) we can only hope to improve the average case. Anything that needs
> high-order *guarantees* has to rely on CMA or another kind of reservation (yeah
> even CMA is a pageblock reservation in some sense).
> > In any case, we need a large amount of testing to cover all scenarios.
> > It should be called at the right time.
> > I don't have any data to present right now.
> > Maybe I will try to capture some data and present it here.
> That would be nice. I'm going to collect some as well.

In particular, I would like to see the results on low RAM (less than 512M).
I will also share anything interesting that I find.
