Re: [PATCH v7 14/14] mm,hwpoison: Try to narrow window race for free pages

From: HORIGUCHI NAOYA(堀口 直也)
Date: Wed Sep 23 2020 - 03:40:28 EST


On Tue, Sep 22, 2020 at 03:56:50PM +0200, Oscar Salvador wrote:
> Aristeu Rozanski reported that a customer test case started
> to report -EBUSY after the hwpoison rework patchset.
>
> There is a race window between spotting a free page and taking it off
> its buddy freelist, so by the time we try to take it off, the page may
> already have been allocated.
>
> This patch narrows that race window by retrying once: if the page was
> allocated under us, we handle it again as its new type (an in-use page).
>
> Signed-off-by: Oscar Salvador <osalvador@xxxxxxx>
> Reported-by: Aristeu Rozanski <aris@xxxxxxxxx>
> Tested-by: Aristeu Rozanski <aris@xxxxxxxxx>

Acked-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>

> ---
> mm/memory-failure.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 46b1821d2817..8f23d3c7a0a2 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1903,6 +1903,7 @@ int soft_offline_page(unsigned long pfn, int flags)
>  {
>  	int ret;
>  	struct page *page;
> +	bool try_again = true;
>
>  	if (!pfn_valid(pfn))
>  		return -ENXIO;
> @@ -1918,6 +1919,7 @@ int soft_offline_page(unsigned long pfn, int flags)
>  		return 0;
>  	}
>
> +retry:
>  	get_online_mems();
>  	ret = get_any_page(page, pfn, flags);
>  	put_online_mems();
> @@ -1925,7 +1927,10 @@ int soft_offline_page(unsigned long pfn, int flags)
>  	if (ret > 0)
>  		ret = soft_offline_in_use_page(page);
>  	else if (ret == 0)
> -		ret = soft_offline_free_page(page);
> +		if (soft_offline_free_page(page) && try_again) {
> +			try_again = false;
> +			goto retry;
> +		}
>
>  	return ret;
>  }
> --
> 2.26.2
>
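
For readers following the thread, here is a minimal userspace sketch of
the retry-once pattern in the hunk above. It is not kernel code: struct
sim_page and every sim_* function are hypothetical stand-ins invented for
illustration, and the "race" is simulated with a counter. The sketch also
stores the free-page result in ret, which is a choice made here and not
something taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical page model: the state may change between two calls. */
enum page_state { PAGE_FREE, PAGE_IN_USE };

struct sim_page {
	enum page_state state;
	int races_left;	/* how many times an allocation wins the race */
};

/* Stand-in for get_any_page(): > 0 for an in-use page, 0 for a free one. */
static int sim_get_any_page(const struct sim_page *p)
{
	return p->state == PAGE_IN_USE ? 1 : 0;
}

/*
 * Stand-in for taking a free page off its freelist. It fails with
 * -EBUSY when the page was allocated in the window (simulated).
 */
static int sim_offline_free_page(struct sim_page *p)
{
	if (p->races_left > 0) {
		p->races_left--;
		p->state = PAGE_IN_USE;	/* allocated under us */
		return -EBUSY;
	}
	return 0;
}

/* Stand-in for the in-use path; assumed to always succeed here. */
static int sim_offline_in_use_page(struct sim_page *p)
{
	assert(p->state == PAGE_IN_USE);
	return 0;
}

/* Retry at most once if the free-page path lost the race. */
static int sim_soft_offline_page(struct sim_page *p)
{
	bool try_again = true;
	int ret;

retry:
	ret = sim_get_any_page(p);
	if (ret > 0) {
		ret = sim_offline_in_use_page(p);
	} else if (ret == 0) {
		/* Sketch choice: keep the result so a failure is reported. */
		ret = sim_offline_free_page(p);
		if (ret && try_again) {
			try_again = false;
			goto retry;	/* re-classify the page and try again */
		}
	}
	return ret;
}
```

With races_left set to 1, the first pass fails on the free-page path, the
retry sees the page as in use, and the call ends on the in-use path.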