Re: [RFC 1/2] mm: disable LRU pagevec during the migration temporarily

From: Matthew Wilcox
Date: Tue Feb 16 2021 - 13:24:15 EST


On Tue, Feb 16, 2021 at 09:03:47AM -0800, Minchan Kim wrote:
> LRU pagevec holds refcount of pages until the pagevec are drained.
> It could prevent migration since the refcount of the page is greater
> than the expection in migration logic. To mitigate the issue,
> callers of migrate_pages drains LRU pagevec via migrate_prep or
> lru_add_drain_all before migrate_pages call.
>
> However, it's not enough because pages coming into pagevec after the
> draining call still could stay at the pagevec so it could keep
> preventing page migration. Since some callers of migrate_pages have
> retrial logic with LRU draining, the page would migrate at next trail
> but it is still fragile in that it doesn't close the fundamental race
> between upcoming LRU pages into pagvec and migration so the migration
> failure could cause contiguous memory allocation failure in the end.

Have you been able to gather any numbers on this? eg does migration
now succeed 5% more often?