Re: [PATCH v2 07/10] mm, compaction: restrict async compaction to pageblocks of same migratetype

From: Vlastimil Babka
Date: Fri Feb 17 2017 - 11:32:21 EST

Next message: Thomas Gleixner: "Re: v4.10-rc8 (-rc6) boot regression on Intel desktop, does not boot after cold boots, boots after reboot"
Previous message: Hartley Sweeten: "RE: [PATCH 02/35] ARM: ep93xx: Convert remaining uses of pr_warning to pr_warn"
Next in thread: Johannes Weiner: "Re: [PATCH v2 07/10] mm, compaction: restrict async compaction to pageblocks of same migratetype"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On 02/14/2017 09:10 PM, Johannes Weiner wrote:
> On Fri, Feb 10, 2017 at 06:23:40PM +0100, Vlastimil Babka wrote:
>> The migrate scanner in async compaction is currently limited to MIGRATE_MOVABLE
>> pageblocks. This is a heuristic intended to reduce latency, based on the
>> assumption that non-MOVABLE pageblocks are unlikely to contain movable pages.
>>
>> However, with the exception of THP's, most high-order allocations are not
>> movable. Should the async compaction succeed, this increases the chance that
>> the non-MOVABLE allocations will fallback to a MOVABLE pageblock, making the
>> long-term fragmentation worse.
>>
>> This patch attempts to help the situation by changing async direct compaction
>> so that the migrate scanner only scans the pageblocks of the requested
>> migratetype. If it's a non-MOVABLE type and there are such pageblocks that do
>> contain movable pages, chances are that the allocation can succeed within one
>> of such pageblocks, removing the need for a fallback. If that fails, the
>> subsequent sync attempt will ignore this restriction.
>>
>> Signed-off-by: Vlastimil Babka <vbabka@xxxxxxx>
>
> Yes, IMO we should make the async compaction scanner decontaminate
> unmovable blocks. This is because we fall back to other-typed blocks
> before we reclaim,

Which we could change too, patch 9 is a step in that direction.

> so any unmovable blocks that aren't perfectly
> occupied will fill with greedy page cache (and order-0 doesn't steal
> blocks back to make them compactable again).

order-0 allocation can actually steal the block back, the decisions to steal are
based on the order of the free pages in the fallback block, not on the
allocation order. But maybe I'm not sure what exactly you meant here.

> Subsequent unmovable
> higher-order allocations in turn are more likely to fall back and
> steal more movable blocks.

Yes.

> As long as we have vastly more movable blocks than unmovable blocks,
> continuous page cache turnover will counteract this negative trend -
> pages are reclaimed mostly from movable blocks and some unmovable
> blocks, while new cache allocations are placed into the freed movable
> blocks - slowly moving cache out from unmovable blocks into movable
> ones. But that effect is independent of the rate of higher-order
> allocations and can be overwhelmed, so I think it makes sense to
> involve compaction directly in decontamination.

Interesting observation, I agree.

> The thing I'm not entirely certain about is the aggressiveness of this
> patch. Instead of restricting the async scanner to blocks of the same
> migratetype, wouldn't it be better (in terms of allocation latency) to
> simply let it compact *all* block types?

Yes it would help allocation latency, but I'm afraid it will remove most of the
decontamination effect.

> Maybe changing it to look at
> unmovable blocks is enough to curb cross-contamination. Sure there
> will still be some, but now we're matching the decontamination rate to
> the rate of !movable higher-order allocations and don't just rely on
> the independent cache turnover rate, which during higher-order bursts
> might not be high enough to prevent an expansion of unmovable blocks.

The rate of compaction attempts is matched with allocations, but the probability
of compaction scanner being in unmovable block is low when the majority of
blocks are movable. So the decontamination rate is proportional but much smaller.

> Does that make sense?

I guess I can try and look at the stats, but I have doubts.

Thanks for the feedback!

Next message: Thomas Gleixner: "Re: v4.10-rc8 (-rc6) boot regression on Intel desktop, does not boot after cold boots, boots after reboot"
Previous message: Hartley Sweeten: "RE: [PATCH 02/35] ARM: ep93xx: Convert remaining uses of pr_warning to pr_warn"
Next in thread: Johannes Weiner: "Re: [PATCH v2 07/10] mm, compaction: restrict async compaction to pageblocks of same migratetype"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]