Re: [PATCH v3 0/6] Introduce ZONE_CMA

From: Feng Tang
Date: Fri May 27 2016 - 02:23:44 EST


On Fri, May 27, 2016 at 01:28:20PM +0800, Joonsoo Kim wrote:
> On Thu, May 26, 2016 at 04:04:54PM +0800, Feng Tang wrote:
> > On Thu, May 26, 2016 at 02:22:22PM +0800, js1304@xxxxxxxxx wrote:
> > > From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> >

> > > FYI, there is another attempt [3] on lkml trying to solve this problem.
> > > And, as far as I know, Qualcomm also has an out-of-tree solution for
> > > this problem.
> >
> > This may be a little off-topic :) Actually, we have used another way in
> > our products: we disable the fallback from MIGRATETYPE_MOVABLE to
> > MIGRATETYPE_CMA completely, and only allow free CMA memory to be used
> > by file page cache (which is easy to reclaim by its nature).
> > We did it by adding a GFP_PAGE_CACHE flag to every allocation request
> > for page cache; the MM tries to pick up an available free CMA page
> > first, and goes to the normal path when that fails.
>
> Just wondering, why do you allow CMA memory for file page cache rather
> than anonymous pages? I guess that anonymous pages would be more easily
> migrated/reclaimed than file page cache. In fact, some of our products
> use an anonymous page adaptation to satisfy a similar requirement by
> introducing GFP_CMA. AFAIK, some chip vendors also use an "anonymous
> page first" adaptation to get a better success rate.

The biggest problem we faced was allocating big chunks of CMA memory,
say 256MB as a whole, or 9 pieces of 20MB buffers, so speed is not our
biggest concern; what matters is whether all the CMA pages can be reclaimed.

With the MOVABLE fallback, there may be many kinds of bad guys in device
drivers, the kernel, or other subsystems that refuse to return the borrowed
CMA pages. So I took a lazy way and only allowed page cache to use free
CMA pages, and we saw good results: it passes most of our tests for
allocating big chunks.
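
To make the policy concrete, here is a minimal user-space sketch of the
decision described above. It is not our kernel patch: the flag value, the
struct, and the function names are all made up for illustration, and the
real change sits in the page allocator paths.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit value; the mail names GFP_PAGE_CACHE but not its value. */
#define GFP_PAGE_CACHE 0x1u

/* Which pool served a request (illustrative, not a kernel type). */
enum pool { POOL_NONE, POOL_NORMAL, POOL_CMA };

/* Toy model of free memory: just two counters. */
struct mem_state {
    size_t free_normal; /* free pages outside the CMA region */
    size_t free_cma;    /* free pages inside the CMA region */
};

/* Allocate one page and report which pool it came from. */
enum pool alloc_one_page(struct mem_state *m, unsigned int gfp)
{
    /* Only page cache requests may borrow CMA, and they try CMA first. */
    if ((gfp & GFP_PAGE_CACHE) && m->free_cma > 0) {
        m->free_cma--;
        return POOL_CMA;
    }
    /* Normal path: all other requests, or page cache when CMA is full.
     * There is no MOVABLE -> CMA fallback here. */
    if (m->free_normal > 0) {
        m->free_normal--;
        return POOL_NORMAL;
    }
    return POOL_NONE;
}

/* Small walk-through of the three cases. */
void cma_policy_demo(void)
{
    struct mem_state m = { .free_normal = 1, .free_cma = 1 };

    /* A non page cache request never touches CMA. */
    assert(alloc_one_page(&m, 0) == POOL_NORMAL);
    /* A page cache request takes the free CMA page first. */
    assert(alloc_one_page(&m, GFP_PAGE_CACHE) == POOL_CMA);
    /* Nothing is left in either pool. */
    assert(alloc_one_page(&m, 0) == POOL_NONE);
}
```

The point of the sketch is only the ordering: CMA is tried first for page
cache and never for anything else, so whatever sits in CMA should be
reclaimable page cache.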

One of our customers used to use a CMA sharing patch from another vendor
on our SoCs, which couldn't pass these tests, and they finally took our
page cache approach.

>
> > It works fine on our products, though we still see some cases where
> > some pages can't be reclaimed.
> >
> > Our product has a special use case of CMA: sometimes it will
> > need to use the whole CMA memory (say 256MB on a phone), then all
>
> I don't think this usecase is so special. Our product also has a similar
> usecase. And I already know of another one.

:) I first touched CMA in 2014 and have only worked on Sofia platforms.

>
> > shared-out CMA pages need to be reclaimed all at once. Don't know if
> > this new ZONE_CMA approach could meet this request? (our page cache
> > solution can't guarantee to meet this request all the time).
>
> This ZONE_CMA approach would be better than before, since CMA memory
> is not used for blockdev page cache. Blockdev page cache is one of
> the frequent failure points in my experience.

Indeed! I also explicitly disabled CMA sharing for blkdev FS page cache.

>
> I'm not sure that ZONE_CMA works better than your GFP_PAGE_CACHE
> adaptation for your system. In ZONE_CMA, CMA memory is used for file
> page cache or anonymous pages. If my assumption that anonymous pages
> are easier to migrate/reclaim is correct, ZONE_CMA would work
> better than your adaptation since there are fewer file page cache pages
> in CMA memory.
>
> Anyway, it also doesn't guarantee success all the time. There are
> different kinds of problems that prevent CMA allocation from succeeding
> and we need to solve them. I will try that after the problems this
> patchset tries to fix are solved.

ZONE_CMA should be cleaner, while our page cache solution needs to
adjust some policies in the lowmemorykiller and the page scan/reclaim code.

Thanks,
Feng