Re: [PATCH v7 0/7] Introduce ZONE_CMA

From: Joonsoo Kim
Date: Thu Apr 20 2017 - 21:35:32 EST


On Mon, Apr 17, 2017 at 11:02:12AM +0900, Joonsoo Kim wrote:
> On Thu, Apr 13, 2017 at 01:56:15PM +0200, Michal Hocko wrote:
> > On Wed 12-04-17 10:35:06, Joonsoo Kim wrote:
> > > On Tue, Apr 11, 2017 at 08:15:20PM +0200, Michal Hocko wrote:
> > > > Hi,
> > > > I didn't get to read through the patches yet but the cover letter didn't
> > > > really help me to understand the basic concepts to have a good starting
> > > > point before diving into implementation details. It contains a lot of
> > > > history remarks which is not bad but IMHO too excessive here. I would
> > > > appreciate the following information (some of that is already provided
> > > > in the cover but could benefit from some rewording/text reorganization).
> > > >
> > > > - what is ZONE_CMA and how it is configured (from admin POV)
> > > > - how does ZONE_CMA compare to other zones
> > > > - who is allowed to allocate from this zone and what are the
> > > > guarantees/requirements for successful allocation
> > > > - how does the zone compare to a preallocated allocation pool
> > > > - how is ZONE_CMA balanced/reclaimed due to internal memory pressure
> > > > (from CMA users)
> > > > - is this zone reclaimable for the global memory reclaim
> > > > - why this was/is controversial
> > >
> > > Hello,
> > >
> > > I hope the following summary helps you understand this patchset.
> > > I skip some basic things about CMA. I will add this description to
> > > the cover letter if a re-spin is needed.
> >
> > I believe that sorting out these questions is more important than what
> > you have in the current cover letter. Andrew tends to fold the cover
> > into the first patch so I think you should update.
>
> Okay.
>
> > > 2. How does ZONE_CMA compare to other zones
> > >
> > > ZONE_CMA is conceptually the same as ZONE_MOVABLE. There is a software
> > > constraint to guarantee the success of future allocation requests from
> > > the device. If the device requests a specific range of memory in the CMA
> > > area at runtime, pages in that range that were allocated by MM will be
> > > migrated to other pages and the range will be returned to the device. To
> > > guarantee this, ZONE_CMA only takes allocation requests with GFP_MOVABLE.
> >
> > The immediate follow up question is. Why cannot we reuse ZONE_MOVABLE
> > for that purpose?
>
> I could make CMA reuse ZONE_MOVABLE but I don't want to. The reasons
> are:
>
> 1. If ZONE_MOVABLE has two different types of memory, hotpluggable and
> CMA, it may need special handling for each type. This would lead to a new
> migratetype again (to distinguish them) and would be error-prone. I
> don't want that.
>
> 2. CMA users want to see usage stats separately since CMA often causes
> problems and separate stats would help to debug them.
>
> > > The other important point about ZONE_CMA is that the span of ZONE_CMA can
> > > overlap with other zones. This is not new to the MM subsystem, which
> > > already has enough logic to handle such a situation, so there would be
> > > no problem.
> >
> > I am not really sure this is actually true. Zones are disjoint from the
> > early beginning. I remember that we had something like numa nodes
> > interleaving but that is such a rare configuration that I wouldn't be
> > surprised if it wasn't very well tested and actually broken in some
> > subtle ways.
>
> I agree with your concern. However, if something is broken for them, it
> just shows that we need to fix it. MM should handle this situation
> since we already know that such architectures exist.
>
> >
> > There are many page_zone(page) != zone checks sprinkled in the code but
> > I do not see anything consistent there. Similarly pageblock_pfn_to_page
> > is only used by compaction but there are other pfn walkers which do
> > ad-hoc checking. I was staring into that code these days due to my
> > hotplug patches.
> >
> > That being said, I think that interleaving zones are an interesting
> > concept but I would be rather nervous to consider this as working
> > currently without a deeper review.
>
> I have tried to audit all the pfn walkers before and have added the
> above-mentioned check. Perhaps I missed something, but I believe not
> that much. Our production systems already use ZONE_CMA and I haven't
> received any report of such a problem.
>
> >
> > > Other things are completely the same as in other zones. From the MM POV,
> > > there is no difference in the allocation process except that it only takes
> > > GFP_MOVABLE requests. In reclaim, pages that are allocated by MM will
> > > be reclaimed under the same MM policy. So, no difference.
> >
> > OK, so essentially this is yet another "highmem" zone. We already know
> > that only GFP_MOVABLE are allowed to fallback to ZONE_CMA but do CMA
> > allocations fallback to other zones and punch new holes? In which zone
> > order?
>
> Hmm... I don't understand your question. Could you elaborate?
>
> > > This 'no difference' is a strong point of this approach. ZONE_CMA is
> > > handled naturally by the MM subsystem, unlike before (special handling
> > > is required for MIGRATE_CMA).
> > >
> > > 3. Controversial Point
> > >
> > > The major concern from Mel is that the zone concept is abused. Zones were
> > > originally introduced to solve issues caused by H/W addressing limitations.
> >
> > Yes, very much agreed on that. You basically want to punch holes into
> > other zones to guarantee an allocation progress. Marking those holes
> > with special migrate type sounds quite natural but I will have to study
> > the current code some more to see whether issues you mention are
> > inherently unfixable. This might very well turn out to be the case.
>
> At a glance, a special migratetype sounds natural. I thought so too.
> However, it's not natural from an implementation POV. A zone consists of
> the same type of memory (by definition?) and the MM subsystem is
> implemented with that assumption. If different types of memory share the
> same zone, it easily causes problems, and the CMA problems are such a case.

Hello, Michal.

If you don't have any more questions, I will send the next version with
an updated cover letter.

Thanks.
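
For reference, the page_zone(page) != zone check discussed above for pfn
walkers on overlapping zone spans can be sketched as a small userspace model.
This is only an illustration under stated assumptions, not kernel code: the
toy_* types and toy_count_zone_pages() are hypothetical names made up for this
sketch, and the "zone" field stands in for whatever page_zone() would return.

```c
#include <stddef.h>

/* Toy model, not kernel code: every name here is hypothetical. */
enum toy_zone_id { TOY_ZONE_NORMAL, TOY_ZONE_CMA };

struct toy_page {
    enum toy_zone_id zone;   /* zone that owns this page frame */
};

struct toy_zone {
    enum toy_zone_id id;
    size_t start_pfn;        /* first pfn in the zone's span */
    size_t end_pfn;          /* one past the last pfn in the span */
};

/*
 * Count the pages in [zone->start_pfn, zone->end_pfn) that really belong
 * to the zone. Because spans may overlap, lying inside the span is not
 * enough; the owning zone of each page has to be compared as well.
 */
static size_t toy_count_zone_pages(const struct toy_page *mem, size_t nr_pages,
                                   const struct toy_zone *zone)
{
    size_t count = 0;

    for (size_t pfn = zone->start_pfn;
         pfn < zone->end_pfn && pfn < nr_pages; pfn++) {
        if (mem[pfn].zone != zone->id)
            continue;        /* in the span, but owned by another zone */
        count++;
    }
    return count;
}
```

With an 8-page array where pfns 3 and 4 are owned by the CMA zone, a walker
over a NORMAL zone spanning pfns 0..7 would skip those two pages, while a
walker over the CMA zone spanning pfns 3..4 would count only them. A walker
that omits the ownership comparison would count the two CMA pages twice.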