Re: [PATCH v1] mm, hwpoison: fix condition in free hugetlb page path

From: Mike Kravetz
Date: Mon Dec 13 2021 - 18:38:22 EST


On 12/10/21 03:02, Naoya Horiguchi wrote:
> From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
>
> When a memory error hits a tail page of a free hugepage,
> __page_handle_poison() is expected to be called to isolate the error at
> 4kB granularity, but it is not called because of an outdated
> if-condition in memory_failure_hugetlb(). This loses the chance to
> isolate the error at the finer granularity, which is not optimal. Drop
> the condition.
>
> This "(p != head && TestSetPageHWPoison(head))" condition is based on the
> old semantics of PageHWPoison on hugepages (where the PG_hwpoison flag
> was set on the subpage), so it is not necessary any more. Now that
> PG_hwpoison is set on the head page for hugepages, concurrent error
> events on different subpages in a single hugepage are prevented by
> TestSetPageHWPoison(head) at the beginning of memory_failure_hugetlb().
> So dropping the condition should not reopen the race window originally
> mentioned in commit b985194c8c0a ("hwpoison, hugetlb:
> lock_page/unlock_page does not match for handling a free hugepage").
>
> Reported-by: Fei Luo <luofei@xxxxxxxxxxxx>
> Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> Cc: <stable@xxxxxxxxxxxxxxx> # v5.14+
> ---
> I set v5.14+ for stable trees because the base code was greatly changed
> by commit 0ed950d1f281 ("mm,hwpoison: make get_hwpoison_page() call
> get_any_page()"), and this patch does not apply cleanly before that,
> although the original issue was introduced earlier.
> ---
> mm/memory-failure.c | 21 +++++++--------------
> 1 file changed, 7 insertions(+), 14 deletions(-)

Thank you Naoya!

In my spare time, I have been testing the memory error code for hugetlb
pages. I noticed this issue and had created a similar patch.
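
For anyone following along, the effect of the stale condition can be
illustrated with a small userspace toy model. This is only a sketch and
not the kernel code: struct toy_page, toy_test_set_hwpoison(),
toy_handle_free_hugepage() and the keep_old_condition switch are
hypothetical names made up for illustration, and TOY_ISOLATED merely
stands in for reaching __page_handle_poison().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy page: only models the hwpoison flag (assumption, not struct page). */
struct toy_page {
	bool hwpoison;
};

enum toy_result {
	TOY_ALREADY_POISONED,	/* rejected by the entry check */
	TOY_SKIPPED,		/* bailed out before per-subpage isolation */
	TOY_ISOLATED,		/* stands in for calling __page_handle_poison(p) */
};

/* Toy stand-in for TestSetPageHWPoison(): set the flag, return the old value. */
static bool toy_test_set_hwpoison(struct toy_page *page)
{
	bool old = page->hwpoison;

	page->hwpoison = true;
	return old;
}

/*
 * Model of an error at subpage 'idx' of a free hugepage whose first
 * element is 'head'. 'keep_old_condition' selects the pre-patch check.
 */
static enum toy_result toy_handle_free_hugepage(struct toy_page *head,
						size_t idx,
						bool keep_old_condition)
{
	struct toy_page *p;

	assert(head != NULL);
	p = head + idx;

	/* Entry check: concurrent events on one hugepage are stopped here. */
	if (toy_test_set_hwpoison(head))
		return TOY_ALREADY_POISONED;

	/*
	 * Stale check: the head flag was just set above, so for any tail
	 * page this is always true and isolation is never reached.
	 */
	if (keep_old_condition && p != head && toy_test_set_hwpoison(head))
		return TOY_SKIPPED;

	return TOY_ISOLATED;
}
```

In this model, an error on a tail page is skipped when the old check is
kept and reaches the isolation step once it is dropped, while a second
event on the same hugepage is still rejected by the entry check.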

Reviewed-by: Mike Kravetz <mike.kravetz@xxxxxxxxxx>
--
Mike Kravetz