Re: [v2 PATCH 1/5] mm: filemap: check if THP has hwpoisoned subpage for PMD page fault

From: Kirill A. Shutemov
Date: Thu Sep 23 2021 - 10:39:06 EST


On Wed, Sep 22, 2021 at 08:28:26PM -0700, Yang Shi wrote:
> When handling shmem page fault the THP with corrupted subpage could be PMD
> mapped if certain conditions are satisfied. But kernel is supposed to
> send SIGBUS when trying to map hwpoisoned page.
>
> There are two paths which may do PMD map: fault around and regular fault.
>
> Before commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths")
> the thing was even worse in fault around path. The THP could be PMD mapped as
> long as the VMA fits regardless what subpage is accessed and corrupted. After
> this commit as long as head page is not corrupted the THP could be PMD mapped.
>
> In the regulat fault path the THP could be PMD mapped as long as the corrupted

s/regulat/regular/

> page is not accessed and the VMA fits.
>
> This loophole could be fixed by iterating every subpage to check if any
> of them is hwpoisoned or not, but it is somewhat costly in page fault path.
>
> So introduce a new page flag called HasHWPoisoned on the first tail page. It
> indicates the THP has hwpoisoned subpage(s). It is set if any subpage of THP
> is found hwpoisoned by memory failure and cleared when the THP is freed or
> split.
>
> Cc: <stable@xxxxxxxxxxxxxxx>
> Suggested-by: Kirill A. Shutemov <kirill.shutemov@xxxxxxxxxxxxxxx>
> Signed-off-by: Yang Shi <shy828301@xxxxxxxxx>
> ---

...

> diff --git a/mm/filemap.c b/mm/filemap.c
> index dae481293b5d..740b7afe159a 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -3195,12 +3195,14 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
> }
>
> if (pmd_none(*vmf->pmd) && PageTransHuge(page)) {
> - vm_fault_t ret = do_set_pmd(vmf, page);
> - if (!ret) {
> - /* The page is mapped successfully, reference consumed. */
> - unlock_page(page);
> - return true;
> - }
> + vm_fault_t ret = do_set_pmd(vmf, page);
> + if (ret == VM_FAULT_FALLBACK)
> + goto out;

Hm.. What? I don't get it. Who will establish page table in the pmd then?

> + if (!ret) {
> + /* The page is mapped successfully, reference consumed. */
> + unlock_page(page);
> + return true;
> + }
> }
>
> if (pmd_none(*vmf->pmd)) {
> @@ -3220,6 +3222,7 @@ static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page)
> return true;
> }
>
> +out:
> return false;
> }
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 5e9ef0fc261e..0574b1613714 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2426,6 +2426,8 @@ static void __split_huge_page(struct page *page, struct list_head *list,
> /* lock lru list/PageCompound, ref frozen by page_ref_freeze */
> lruvec = lock_page_lruvec(head);
>
> + ClearPageHasHWPoisoned(head);
> +

Do we serialize the new flag with lock_page() or what? I mean what
prevents the flag being set again after this point, but before
ClearPageCompound()?

--
Kirill A. Shutemov