Re: [PATCH v5 14/14] mm, compaction: try to capture the just-created high-order freepage

From: Vlastimil Babka
Date: Wed Jul 30 2014 - 05:57:08 EST


On 07/30/2014 10:39 AM, Joonsoo Kim wrote:
On Tue, Jul 29, 2014 at 05:34:37PM +0200, Vlastimil Babka wrote:
Could do it in isolate_migratepages() for whole pageblocks only (as
David's patch did), but that restricts the usefulness. Or maybe do
it fine grained by calling isolate_migratepages_block() multiple
times. But the overhead of multiple calls would probably suck even
more for lower-order compactions. For CMA the added overhead is
basically only checks for next_capture_pfn that will be always
false, so predictable. And mostly just in branches where isolation
is failing, which is not the CMA's "fast path" I guess?

You can do it find grained with compact_control's migratepages list
or new private list. If some pages are isolated and added to this list,
you can check pfn of page on this list and determine appropriate capture
candidate page. This approach can give us more flexibility for
choosing capture candidate without adding more complexity to
common function. For example, you can choose capture candidate if
there are XX isolated pages in certain range.

Hm I see. But the logic added by page capture was also a prerequisity for the "[RFC PATCH V4 15/15] mm, compaction: do not migrate pages when that cannot satisfy page fault allocation"
http://marc.info/?l=linux-mm&m=140551859423716&w=2

And that could be hardly done by a post-isolation inspection of the migratepages list. And I haven't given up on that idea yet :)

In __isolate_free_page(), we check zone_watermark_ok() with order 0.
But normal allocation logic would check zone_watermark_ok() with requested
order. Your capture logic uses __isolate_free_page() and it would
affect compaction success rate significantly. And it means that
capture logic allocates high order page on page allocator
too aggressively compared to other component such as normal high order

It's either that, or the extra lru drain that makes the different.
But the "aggressiveness" would in fact mean better accuracy.
Watermark checking may be inaccurate. Especially when memory is
close to the watermark and there is only a single high-order page
that would satisfy the allocation.

If this "aggressiveness" means better accuracy, fixing general
function, watermark_ok() is better than adding capture logic.

That's if fixing the function wouldn't add significant overhead to all the callers. And making it non-racy and not prone to per-cpu counter drifts would certainly do that :(

But, I guess that there is a reason that watermark_ok() is so
conservative. If page allocator aggressively provides high order page,
future atomic high order page request cannot succeed easily. For
preventing this situation, watermark_ok() should be conservative.

I don't think it's intentionally conservative, just unreliable. It tests two things together:

1) are there enough free pages for the allocation wrt watermarks?
2) does it look like that there is a free page of the requested order?

The 1) works fine and my patch won't change that by passing a order=0. The problem is with 2) which is unreliable, especially when close to the watermarks. Note that it's not trying to keep some reserves for atomic requests. That's what MIGRATE_RESERVE is for. It's just unreliable to decide if there is the high-order page available. Even though its allocation would preserve the watermarks, so there is no good reason to prevent the allocation. So it will often pass when deciding to stop compaction, and then fail when allocating.

Thanks.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/