Re: [PATCH v3 0/8] try to reduce fragmenting fallbacks

From: Johannes Weiner
Date: Wed Mar 08 2017 - 12:24:53 EST


On Tue, Mar 07, 2017 at 02:15:37PM +0100, Vlastimil Babka wrote:
> Last year, Johannes Weiner has reported a regression in page mobility
> grouping [1] and while the exact cause was not found, I've come up with some
> ways to improve it by reducing the number of allocations falling back to
> different migratetype and causing permanent fragmentation.

I finally managed to get a handful of our machines on 4.10 with these
patches applied and a 4.10 vanilla control group.

The sampling period is over twelve hours, which is on the short side
for evaluating this workload, so take the results with a grain of salt.

The allocstall rate (events per second) is down on average, but there
are occasional fairly high spikes that exceed the peaks in vanilla 4.10:

http://cmpxchg.org/antifrag/allocstallrate.png
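
For anyone who wants to collect comparable numbers: a per-second rate
like this can be derived from two /proc/vmstat snapshots. The sketch
below is only an illustration under assumptions, not the script behind
the graph. The function names are hypothetical, and it sums every
counter whose name starts with "allocstall" on the assumption that the
counter is split per zone on the kernel being measured. The same
approach should work for compact_stall, compact_free_scanned and
compact_migrate_scanned.

```python
import time


def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            counters[parts[0]] = int(parts[1])
    return counters


def counter_total(counters, prefix):
    """Sum all counters whose name starts with prefix (e.g. allocstall_*)."""
    return sum(v for k, v in counters.items() if k.startswith(prefix))


def rate_per_second(before, after, prefix, interval_s):
    """Events per second between two snapshots interval_s seconds apart."""
    delta = counter_total(after, prefix) - counter_total(before, prefix)
    return delta / interval_s


def sample_rate(prefix, interval_s=10.0, path="/proc/vmstat"):
    """Take two live snapshots and return the rate for a counter prefix.

    Hypothetical helper: the interval is whatever time.sleep() delivers,
    so treat the result as approximate.
    """
    with open(path) as f:
        before = parse_vmstat(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_vmstat(f.read())
    return rate_per_second(before, after, prefix, interval_s)
```

Calling sample_rate("allocstall") periodically and logging the result
with a timestamp would give a time series that can be plotted per
machine.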

Activity from the compaction free scanner is down, while the migration
scanner does more work. I would assume most of this is coming from the
same-migratetype restriction on the source blocks:

http://cmpxchg.org/antifrag/compactfreescannedrate.png
http://cmpxchg.org/antifrag/compactmigratescannedrate.png

Unfortunately, the average compaction stall rate is consistently much
higher with the patches. The 1h rate averages are 2-3x higher:

http://cmpxchg.org/antifrag/compactstallrate.png
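
To make the "1h rate average" comparison concrete, here is a minimal
sketch of one way to compute it, assuming each group's data is a list
of (timestamp in seconds, events per second) samples. The helper names
and the input layout are assumptions for illustration and may not match
how the graph was generated.

```python
def window_averages(samples, window_s=3600):
    """Average (timestamp_s, rate) samples into consecutive fixed windows.

    Returns a list of (window_start_s, mean_rate), ordered by time.
    """
    buckets = {}
    for ts, rate in samples:
        start = int(ts // window_s) * window_s
        buckets.setdefault(start, []).append(rate)
    return [(start, sum(r) / len(r)) for start, r in sorted(buckets.items())]


def window_ratios(patched, vanilla, window_s=3600):
    """Per-window ratio of patched to vanilla mean rate.

    Only windows present in both series with a nonzero vanilla mean
    are reported.
    """
    base = dict(window_averages(vanilla, window_s))
    ratios = []
    for start, mean in window_averages(patched, window_s):
        if start in base and base[start] > 0:
            ratios.append((start, mean / base[start]))
    return ratios
```

A ratio that stays above 1 in every window would correspond to a
consistently higher rate, as opposed to a few spikes pulling up the
overall mean.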

An increase in direct compaction is a bit worrisome, but the task
completion rates - the bottom line metric for this workload - are
still too chaotic to say whether the increased allocation latency
affects us meaningfully here. I'll give it a few more days.

Is there any other data you would like me to gather?

Thanks!