Re: [PATCH v3 0/6] Introduce ZONE_CMA

From: Joonsoo Kim
Date: Fri May 27 2016 - 02:41:25 EST


On Fri, May 27, 2016 at 02:25:27PM +0800, Feng Tang wrote:
> On Fri, May 27, 2016 at 01:28:20PM +0800, Joonsoo Kim wrote:
> > On Thu, May 26, 2016 at 04:04:54PM +0800, Feng Tang wrote:
> > > On Thu, May 26, 2016 at 02:22:22PM +0800, js1304@xxxxxxxxx wrote:
> > > > From: Joonsoo Kim <iamjoonsoo.kim@xxxxxxx>
> > >
>
> > > > FYI, there is another attempt [3] trying to solve this problem in lkml.
> > > > And, as far as I know, Qualcomm also has an out-of-tree solution for this
> > > > problem.
> > >
> > > This may be a little off-topic :) Actually, we have used another way in
> > > our products: we disable the fallback from MIGRATETYPE_MOVABLE to
> > > MIGRATETYPE_CMA completely, and only allow free CMA memory to be used
> > > by file page cache (which is easy to reclaim by its nature).
> > > We did it by adding a GFP_PAGE_CACHE flag to every allocation request
> > > for page cache, so the MM tries to pick up an available free CMA page
> > > first, and falls back to the normal path when that fails.
> >
> > Just wondering, why do you allow CMA memory for file page cache rather
> > than anonymous pages? I guess that anonymous pages would be more easily
> > migrated/reclaimed than file page cache. In fact, some of our products
> > use an anonymous page adaptation to satisfy a similar requirement by
> > introducing GFP_CMA. AFAIK, some chip vendors also use an "anonymous
> > page first" adaptation to get a better success rate.
>
> The biggest problem we faced is allocating big chunks of CMA memory,
> say 256MB as a whole, or 9 pieces of 20MB buffers, so speed is not
> the biggest concern, but rather whether all the CMA pages can be reclaimed.

Okay. Our product has a similar workload.

> With the MOVABLE fallback, there may be many types of bad guys from device
> drivers/kernel or different subsystems who refuse to return the borrowed
> CMA pages, so I took a lazy way by only allowing page cache to use free
> CMA pages, and we see good results which pass most of the tests for
> allocating big chunks.

Could you explain more about why you chose file page cache rather than
anonymous pages? If there is a reason, I'd like to test it myself.
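
To make the comparison concrete, here is a toy userspace model of the policy described in the quoted text, as far as it can be read from this thread: page cache requests try free CMA pages first and fall back to the normal path, while all other requests never touch CMA. This is only a sketch. Every name in it (req_kind, pools, alloc_page_model) is hypothetical and not taken from any kernel tree, and real page allocation involves far more than two counters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request classes; not real GFP flags. */
enum req_kind { REQ_PAGE_CACHE, REQ_ANON, REQ_KERNEL };

/* Which pool a page was taken from in this model. */
enum pool { POOL_NONE, POOL_CMA, POOL_NORMAL };

struct pools {
    int cma_free;     /* free pages inside the CMA area */
    int normal_free;  /* free pages outside the CMA area */
};

/*
 * Page cache requests take a free CMA page when one exists and otherwise
 * fall back to the normal pool. Every other request uses the normal pool
 * only, i.e. there is no fallback into CMA for them.
 */
static enum pool alloc_page_model(struct pools *p, enum req_kind kind)
{
    assert(p != NULL);

    if (kind == REQ_PAGE_CACHE && p->cma_free > 0) {
        p->cma_free--;
        return POOL_CMA;
    }
    if (p->normal_free > 0) {
        p->normal_free--;
        return POOL_NORMAL;
    }
    return POOL_NONE;
}
```

Swapping REQ_PAGE_CACHE for REQ_ANON in the first check gives the "anonymous page first" variant of the same model.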

> One of our customers used to use a CMA sharing patch from another vendor
> on our SoCs, which couldn't pass these tests, and finally took our page
> cache approach.

CMA has too many problems, so each vendor uses its own adaptation. I'd
like to solve this code fragmentation by fixing the problems in the upstream
kernel, and this ZONE_CMA is one of those efforts. If you can share a
pointer to your adaptation, it would be very helpful to me.

Thanks.

> >
> > > It works fine on our products, though we still see some cases where
> > > some pages can't be reclaimed.
> > >
> > > Our product has a special use case for CMA: sometimes it needs to
> > > use the whole CMA memory (say 256MB on a phone), and then all
> >
> > I don't think this use case is so special. Our product also has a similar
> > use case. And I already know of another one.
>
> :) I first touched CMA in 2014 and have only worked on Sofia platforms.
>
> >
> > > shared-out CMA pages need to be reclaimed all at once. I don't know
> > > whether this new ZONE_CMA approach could meet this requirement (our page
> > > cache solution can't guarantee to meet it all the time).
> >
> > This ZONE_CMA approach would be better than before, since CMA memory
> > is not used for blockdev page cache. Blockdev page cache is one of
> > the frequent failure points in my experience.
>
> Indeed! I also explicitly disabled CMA sharing for blkdev FS page cache.
>
> >
> > I'm not sure that ZONE_CMA works better than your GFP_PAGE_CACHE
> > adaptation for your system. In ZONE_CMA, CMA memory is used for file
> > page cache or anonymous pages. If my assumption that anonymous pages
> > are easier to migrate/reclaim is correct, ZONE_CMA would work
> > better than your adaptation since there would be fewer file page cache
> > pages in CMA memory.
> >
> > Anyway, it also isn't guaranteed to succeed all the time. There are
> > different kinds of problems that prevent CMA allocation from succeeding,
> > and we need to solve them. I will work on them after the problems that
> > this patchset tries to fix are solved.
>
> ZONE_CMA should be cleaner, while our page cache solution needs to
> adjust some policy for lowmemorykiller and page scan/reclaim code.
>
> Thanks,
> Feng
>