[PATCH v8 1/4] mm,hwpoison,hugetlb,memory_hotplug: hotremove memory section with hwpoisoned hugepage

From: Naoya Horiguchi
Date: Tue Oct 25 2022 - 01:36:15 EST


On Tue, Oct 25, 2022 at 10:38:11AM +0800, Miaohe Lin wrote:
> On 2022/10/24 14:20, Naoya Horiguchi wrote:
> > From: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> >
> > HWPoisoned page is not supposed to be accessed once marked, but currently
> > such accesses can happen during memory hotremove because do_migrate_range()
> > can be called before dissolve_free_huge_pages() is called.
> >
> > Clear HPageMigratable for hwpoisoned hugepages to prevent them from being
> > migrated. This should be done in hugetlb_lock to avoid race against
> > isolate_hugetlb().
> >
> > get_hwpoison_huge_page() needs to have a flag to show it's called from
> > unpoison to take refcount of hwpoisoned hugepages, so add it.
> >
> > Reported-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> > Signed-off-by: Naoya Horiguchi <naoya.horiguchi@xxxxxxx>
> > Reviewed-by: Oscar Salvador <osalvador@xxxxxxx>
> > Reviewed-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
> > ---
> > ChangeLog v3 -> v7:
> > - introduce TESTCLEARHPAGEFLAG() to determine the value of migratable_cleared
>
> Many thanks for the update, Naoya. I'm sorry, but TestClearHPageMigratable() might be somewhat
> overkill. As we discussed in the previous thread:
>
> """
> I think I might be nitpicking... But it seems ClearHPageMigratable is not enough here.
> 1. In MF_COUNT_INCREASED case, we don't know whether HPageMigratable is set.
> 2. Even if HPageMigratable is set, there might be a race window before we clear HPageMigratable?
> So "*migratable_cleared = TestClearHPageMigratable" might be better? But I might be wrong.
> """
>
> Case 2 turns out to be a non-problem (sorry about that). HPageMigratable is always cleared while holding
> the hugetlb_lock, which is already held by get_huge_page_for_hwpoison(). So the only case we should care
> about is case 1, and that can be handled by the more efficient pattern below:
>     if (HPageMigratable)
>         ClearHPageMigratable()
>
> So the overhead of the atomic test-and-clear op can be avoided. But this is trivial.
>
> Anyway, this patch still looks good to me. And my Reviewed-by tag still applies. Many thanks.
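
For reference, the two patterns compared above can be sketched in userspace C
as below. This is only an illustration under stated assumptions, not the kernel
code in this patch: struct fake_page, MIGRATABLE_BIT, fake_lock and all three
function names are hypothetical stand-ins for the hugepage flags word, the
HPageMigratable bit and hugetlb_lock, and C11 atomics stand in for the kernel
bitops.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the HPageMigratable bit. */
#define MIGRATABLE_BIT (1UL << 0)

/* Hypothetical stand-in for the hugepage flags word. */
struct fake_page {
	atomic_ulong flags;
};

/* Hypothetical stand-in for hugetlb_lock. */
static pthread_mutex_t fake_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Pattern A: a single atomic read-modify-write that clears the bit and
 * reports whether it was set beforehand.
 */
static bool test_clear_migratable(struct fake_page *p)
{
	unsigned long old = atomic_fetch_and(&p->flags, ~MIGRATABLE_BIT);

	return (old & MIGRATABLE_BIT) != 0;
}

/*
 * Pattern B: test first, clear only when the bit is set. The
 * read-modify-write is skipped entirely when the bit is already clear.
 * This relies on the caller holding the lock that serializes every
 * clear of the bit, so nothing can clear it between the test and the
 * clear.
 */
static bool check_then_clear_migratable(struct fake_page *p)
{
	if (!(atomic_load(&p->flags) & MIGRATABLE_BIT))
		return false;

	atomic_fetch_and(&p->flags, ~MIGRATABLE_BIT);
	return true;
}

/* Caller side of pattern B: take the lock, then test and clear. */
static bool locked_clear_migratable(struct fake_page *p)
{
	bool cleared;

	pthread_mutex_lock(&fake_lock);
	cleared = check_then_clear_migratable(p);
	pthread_mutex_unlock(&fake_lock);

	return cleared;
}
```

Both variants return the same "was it set?" answer that migratable_cleared
needs; the only difference in this sketch is whether the read-modify-write is
issued when the bit is already clear.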

OK, so I'll replace this 1/4 with the following one. Thank you.

- Naoya Horiguchi
---