Re: [patch v2] mm, compaction: avoid isolating pinned pages
From: Hugh Dickins
Date: Wed Feb 05 2014 - 15:57:37 EST
On Tue, 4 Feb 2014, David Rientjes wrote:
> Page migration will fail for memory that is pinned in memory with, for
> example, get_user_pages(). In this case, it is unnecessary to take
> zone->lru_lock or isolating the page and passing it to page migration
> which will ultimately fail.
>
> This is a racy check, the page can still change from under us, but in
> that case we'll just fail later when attempting to move the page.
>
> This avoids very expensive memory compaction when faulting transparent
> hugepages after pinning a lot of memory with a Mellanox driver.
>
> On a 128GB machine and pinning ~120GB of memory, before this patch we
> see the enormous disparity in the number of page migration failures
> because of the pinning (from /proc/vmstat):
>
> compact_pages_moved 8450
> compact_pagemigrate_failed 15614415
>
> 0.05% of pages isolated are successfully migrated and explicitly
> triggering memory compaction takes 102 seconds. After the patch:
>
> compact_pages_moved 9197
> compact_pagemigrate_failed 7
>
> 99.9% of pages isolated are now successfully migrated in this
> configuration and memory compaction takes less than one second.
>
> Signed-off-by: David Rientjes <rientjes@xxxxxxxxxx>
> ---
> v2: address page count issue per Joonsoo
>
> mm/compaction.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/mm/compaction.c b/mm/compaction.c
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -578,6 +578,15 @@ isolate_migratepages_range(struct zone *zone, struct compact_control *cc,
> continue;
> }
>
> + /*
> + * Migration will fail if an anonymous page is pinned in memory,
> + * so avoid taking lru_lock and isolating it unnecessarily in an
> + * admittedly racy check.
> + */
> + if (!page_mapping(page) &&
> + page_count(page) > page_mapcount(page))
> + continue;
> +
> /* Check if it is ok to still hold the lock */
> locked = compact_checklock_irqsave(&zone->lru_lock, &flags,
Much better, maybe good enough as an internal patch to fix a particular
problem you're seeing; but not yet good enough to go upstream.
Anonymous pages are not the only pages which might be pinned,
and your test doesn't mention PageAnon, so does not match your comment.
I've remembered is_page_cache_freeable() in mm/vmscan.c, which gives
more assurance that a page_count - page_has_private test is appropriate,
whatever the filesystem and migrate method to be used.
So I think the test you're looking for is
pincount = page_count(page) - page_mapcount(page);
if (page_mapping(page))
pincount -= 1 + page_has_private(page);
if (pincount > 0)
continue;
but please cross-check and test that out, it's easy to be off-by-one etc.
For a moment I thought a PageWriteback test would be useful too,
but no, that should already appear in the pincount.
Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/