Re: [PATCH] mm,page_alloc,cma: conditionally prefer cma pageblocks for movable allocations

From: Roman Gushchin
Date: Wed Apr 01 2020 - 22:54:29 EST


On Wed, Apr 01, 2020 at 07:13:22PM -0700, Andrew Morton wrote:
> On Thu, 12 Mar 2020 10:41:28 +0900 Joonsoo Kim <js1304@xxxxxxxxx> wrote:
>
> > Hello, Roman.
> >
> > On Thu, Mar 12, 2020 at 2:35 AM, Roman Gushchin <guro@xxxxxx> wrote:
> > >
> > > On Wed, Mar 11, 2020 at 09:51:07AM +0100, Vlastimil Babka wrote:
> > > > On 3/6/20 9:01 PM, Rik van Riel wrote:
> > > > > Posting this one for Roman so I can deal with any upstream feedback and
> > > > > create a v2 if needed, while scratching my head over the next piece of
> > > > > this puzzle :)
> > > > >
> > > > > ---8<---
> > > > >
> > > > > From: Roman Gushchin <guro@xxxxxx>
> > > > >
> > > > > Currently a cma area is barely used by the page allocator because
> > > > > it's used only as a fallback from movable, however kswapd tries
> > > > > hard to make sure that the fallback path isn't used.
> > > >
> > > > Few years ago Joonsoo wanted to fix these kinds of weird MIGRATE_CMA corner
> > > > cases by using ZONE_MOVABLE instead [1]. Unfortunately it was reverted due to
> > > > unresolved bugs. Perhaps the idea could be resurrected now?
> > >
> > > Hi Vlastimil!
> > >
> > > Thank you for this reminder! I actually looked at it and also asked Joonsoo in private
> > > about the state of this patch(set). As I understand, Joonsoo plans to resubmit
> > > it later this year.
> > >
> > > What Rik and I are suggesting seems to be much simpler, however it's perfectly
> > > possible that Joonsoo's solution is preferable long-term.
> > >
> > > So if the proposed patch looks ok for now, I'd suggest to go with it and return
> > > to this question once we'll have a new version of ZONE_MOVABLE solution.
> >
> > Hmm... utilization is not the only matter for CMA user. The more important
> > one is success guarantee of cma_alloc() and this patch would have a bad
> > impact on it.
> >
> > A few years ago, I have tested this kind of approach and found that increasing
> > utilization increases cma_alloc() failure. Reason is that the page allocated
> > with __GFP_MOVABLE, especially, by sb_bread(), is sometimes pinned by someone.
> >
> > Until now, cma memory isn't used much so this problem doesn't occur easily.
> > However, with this patch, it would happen.
>
> So I guess we keep Roman's patch on hold pending clarification of this
> risk?

The problem here is that we can't really find problems if we don't use CMA as intended
and just leave it free. Rik and I are actively looking for page migration problems
in our production environment, and we've found and fixed some of them. Our setup is
likely different from that of the embedded folks who are, as I understand it, the most
active cma users, so even if we don't see any issues, I can't guarantee the same for
everybody.

So given Joonsoo's ack down in the thread (btw, I'm sorry I missed a good optimization
he suggested; I will send a patch for that), I'd go with this patch at least until
the first complaint. I can prepare a patch to add some debugging to the page migration
path so we'll get an idea of what fails.

As a safety measure, we can make it conditional on the hugetlb_cma kernel argument,
which would exclude any possibility of regression for the majority of users.
But I don't think we have a good reason to do that now.

Thanks!