Re: [RFC PATCH 00/10] redesign compaction algorithm

From: Mel Gorman
Date: Thu Jun 25 2015 - 14:41:51 EST


On Fri, Jun 26, 2015 at 03:14:39AM +0900, Joonsoo Kim wrote:
> > It could though. Reclaim/compaction is entered for orders higher than
> > PAGE_ALLOC_COSTLY_ORDER and when scan priority is sufficiently high.
> > That could be adjusted if you have a viable case where orders <
> > PAGE_ALLOC_COSTLY_ORDER must succeed and currently requires excessive
> > reclaim instead of relying on compaction.
>
> Yes. I saw this problem in a real situation. On ARM, an order-2 allocation
> is requested in fork(), so it should succeed. But there are not enough
> order-2 freepages, so reclaim/compaction begins. Compaction fails
> repeatedly, although I didn't check the exact reason.

That should be identified and repaired prior to reimplementing
compaction because it's important.

> >> >> 3) Compaction capability depends heavily on the migratetype of memory,
> >> >> because the freepage scanner doesn't scan unmovable pageblocks.
> >> >>
> >> >
> >> > For a very good reason. Unmovable allocation requests that fall back to
> >> > other pageblocks are the worst in terms of fragmentation avoidance. The
> >> > more of these events there are, the more the system will decay. If there
> >> > are many of these events then a compaction benchmark may start with high
> >> > success rates but decay over time.
> >> >
> >> > Very broadly speaking, the more the mm_page_alloc_extfrag tracepoint
> >> > triggers with alloc_migratetype == MIGRATE_UNMOVABLE, the faster the
> >> > system is decaying. Having the freepage scanner select unmovable
> >> > pageblocks will trigger this event more frequently.
> >> >
> >> > The unfortunate impact is that selecting unmovable blocks from the free
> >> > scanner will improve compaction success rates for high-order kernel
> >> > allocations early in the lifetime of the system but later fail high-order
> >> > allocation requests as more pageblocks get converted to unmovable. It
> >> > might be ok for kernel allocations but THP will eventually have a 100%
> >> > failure rate.
> >>
> >> I wrote the rationale in the patch itself. We already use non-movable
> >> pageblocks for the migration scanner. It empties non-movable pageblocks,
> >> so the number of freepages on non-movable pageblocks will increase. Using
> >> non-movable pageblocks for the freepage scanner negates this effect, so
> >> the number of freepages on non-movable pageblocks will be balanced. Could
> >> you tell me in detail how having the freepage scanner select unmovable
> >> pageblocks will cause more fragmentation? Possibly, I don't understand
> >> the effect of this patch correctly and need some investigation. :)
> >>
> >
> > The long-term success rate of fragmentation avoidance depends on
> > minimising the number of UNMOVABLE allocation requests that use a
> > pageblock belonging to another migratetype. Once such a fallback occurs,
> > that pageblock potentially can never be used for a THP allocation again.
> >
> > Let's say there is an unmovable pageblock with 500 free pages in it. If
> > the freepage scanner uses that pageblock and allocates all 500 free
> > pages then the next unmovable allocation request needs a new pageblock.
> > If no pageblock is completely free then it will fall back to using a
> > RECLAIMABLE or MOVABLE pageblock, forever contaminating it.
>
> Yes, I can imagine that situation. But, as I said above, we already use
> non-movable pageblocks for the migration scanner. While an unmovable
> pageblock with 500 free pages fills, some other unmovable pageblock
> with some movable pages will be emptied. The number of freepages
> on non-movable pageblocks would be maintained, so fallback doesn't happen.
>
> Anyway, it is better to investigate this effect. I will do it and attach
> the result to the next submission.
>

Let's say we have X unmovable pageblocks and Y pageblocks overall. If the
migration scanner takes movable pages from X then there is more space for
unmovable allocations without having to increase X -- this is good. If
the free scanner uses the X pageblocks as targets then they can fill. The
next unmovable allocation then falls back to another pageblock and we
either have X+1 unmovable pageblocks (full steal) or a mixed pageblock
(partial steal) that cannot be used for THP. Do this enough times and
X == Y and all THP allocations fail.
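
As a rough illustration of the X/Y argument, here is a toy model in Python.
It is not kernel code and every name in it is hypothetical. It assumes
512-page pageblocks, a steady unmovable footprint (each unmovable request
only needs room to exist, it does not accumulate), full steals only, and it
deliberately ignores the migration scanner emptying those pageblocks again,
so it shows the worst case rather than a prediction.

```python
PAGES_PER_BLOCK = 512  # assumption: order-9 pageblocks of 4K pages


def unmovable_blocks_after(total_blocks, unmovable_blocks, free_pages,
                           rounds, migrated_per_round, request_pages,
                           free_scanner_uses_unmovable):
    """Toy model: return X (unmovable pageblocks) after `rounds` rounds."""
    x = unmovable_blocks
    free = free_pages  # free pages inside the X unmovable pageblocks
    for _ in range(rounds):
        if free_scanner_uses_unmovable:
            # Compaction migrates movable pages into the unmovable blocks.
            free -= min(free, migrated_per_round)
        if free < request_pages:
            # No room for the next unmovable request: fall back.
            if x == total_blocks:
                break  # X == Y, nothing left to steal
            x += 1  # full steal of one more pageblock
            free += PAGES_PER_BLOCK
    return x


y = 16
for policy in (False, True):
    x = unmovable_blocks_after(y, 1, 500, 1000, 64, 4, policy)
    print(policy, "X =", x, "THP-capable pageblocks =", y - x)
```

With the free scanner kept out of unmovable pageblocks, X stays where it
started. With it allowed in, X climbs until it equals Y and no pageblock is
left for THP. A partial steal is not modelled; it would leave a mixed
pageblock instead of incrementing X, which is just as bad for THP.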

--
Mel Gorman
SUSE Labs