Re: [PATCH] mm: incorporate read-only pages into transparent huge pages

From: Rik van Riel
Date: Fri Jan 23 2015 - 14:04:39 EST


On 01/23/2015 02:47 AM, Ebru Akagunduz wrote:

> @@ -2169,7 +2169,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		VM_BUG_ON_PAGE(!PageSwapBacked(page), page);
>
>  		/* cannot use mapcount: can't collapse if there's a gup pin */
> -		if (page_count(page) != 1)
> +		if (page_count(page) != 1 + !!PageSwapCache(page))
>  			goto out;
>  		/*
>  		 * We can do it before isolate_lru_page because the
> @@ -2179,6 +2179,17 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
>  		 */
>  		if (!trylock_page(page))
>  			goto out;
> +		if (!pte_write(pteval)) {
> +			if (PageSwapCache(page) && !reuse_swap_page(page)) {
> +				unlock_page(page);
> +				goto out;
> +			}
> +			/*
> +			 * Page is not in the swap cache, and page count is
> +			 * one (see above). It can be collapsed into a THP.
> +			 */
> +		}

Andrea pointed out a bug between the above two parts of
the patch.

Between the point where we check page_count(page) and the
point where we check whether the page got added to the swap
cache, the page count may change. That leaves us racing with
get_user_pages_fast, the pageout code, etc.

It is necessary to check the page count again right after
the trylock_page(page) above, to make sure it was not changed
while the page was not yet locked.

That second check should have a comment explaining that
the first "page_count(page) != 1 + !!PageSwapCache(page)"
check can be unsafe because the page is not yet locked,
so the check needs to be repeated. Maybe something along
the lines of:

/* Re-check the page count with the page locked */
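
To illustrate the check / lock / re-check ordering, here is a
minimal userspace sketch. It is only a model, not kernel code:
struct page_model, model_trylock(), model_unlock(),
model_isolate() and the racer hooks are hypothetical names made
up for this example, and the hook merely simulates another
context changing the count in the unlocked window.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the bits of struct page that matter here. */
struct page_model {
	int count;	/* models page_count(page) */
	bool swapcache;	/* models PageSwapCache(page) */
	bool locked;	/* models the page lock */
};

/* Hook run in the unlocked window, to simulate a racing context. */
typedef void (*race_hook)(struct page_model *page);

static bool model_trylock(struct page_model *page)
{
	if (page->locked)
		return false;
	page->locked = true;
	return true;
}

static void model_unlock(struct page_model *page)
{
	page->locked = false;
}

/* One reference for the mapping, plus one if the swap cache holds the page. */
static bool count_is_expected(const struct page_model *page)
{
	return page->count == 1 + !!page->swapcache;
}

/*
 * Returns true with the page left locked if it may be collapsed,
 * false with the page unlocked otherwise.
 */
static bool model_isolate(struct page_model *page, race_hook racer)
{
	/* First check: unsafe on its own, the page is not locked yet. */
	if (!count_is_expected(page))
		return false;

	if (racer)
		racer(page);	/* the count may change in this window */

	if (!model_trylock(page))
		return false;

	/* Re-check the page count with the page locked */
	if (!count_is_expected(page)) {
		model_unlock(page);
		return false;
	}
	return true;
}

/* Simulated gup pin: takes an extra reference on the page. */
static void racer_gup_pin(struct page_model *page)
{
	page->count++;
}

/* Simulated pageout: adds the page to the swap cache, which takes a reference. */
static void racer_add_to_swapcache(struct page_model *page)
{
	page->swapcache = true;
	page->count++;
}
```

In this model, calling model_isolate() with racer_gup_pin on a
page that starts with count 1 passes the first check but is
rejected by the second one, which is the case the re-check is
meant to catch. With racer_add_to_swapcache the re-check still
passes, since the extra reference is accounted for by the swap
cache, and the reuse_swap_page() test from the patch would be
the next step.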