Re: Question about hwpoison handling of 1GB hugepage
From: HORIGUCHI NAOYA(堀口 直也)
Date: Thu Jun 02 2022 - 01:14:53 EST
On Thu, May 19, 2022 at 02:17:58AM +0000, HORIGUCHI NAOYA(堀口 直也) wrote:
> On Tue, May 17, 2022 at 08:03:03PM +0800, Liu Shixin wrote:
> > Hello Naoya,
> >
> > Is there any progress on memory error handling on 1GB hugepage : )
>
> Hi Shixin,
>
> I have a little ..., I found that error handling fails for anonymous 1GB
> hugepage because __page_handle_poison() fails. I don't pinpoint the issue
> precisely yet, but I feel that there's some issue in free_gigantic_page()
> that fails to send the victim raw page to buddy. I don't think that this is
> an critical issue because the error page should not be reused (it's isolated
> but not in controlled manner). This prevents unpoisoning and make testing
> inefficient, so I'd like to fix.
I posted a patchset enabling 1GB hugepage support,
https://lore.kernel.org/linux-mm/20220602050631.771414-1-naoya.horiguchi@xxxxxxxxx/T/#u
It passed my testing but I appreciate it if you try testing it
in your workload.
Thanks,
Naoya Horiguchi
>
> >
> > Thanks,
> > Liu Shixin
> >
> > On 2022/4/4 7:42, HORIGUCHI NAOYA(堀口 直也) wrote:
> > > On Thu, Mar 31, 2022 at 06:56:25PM +0800, Liu Shixin wrote:
> > >> Hi,
> > >>
> > >> Recently, I found a problem with hwpoison 1GB hugepage.
> > >> I created a process and mapped 1GB hugepage. This process will then fork a
> > >> child process and write/read this 1GB hugepage. Then I inject hwpoison into
> > >> this 1GB hugepage. The child process triggers the memory failure and is
> > >> being killed as expected. After this, the parent process will try to fork a
> > >> new child process and do the same thing. It is killed again and finally it
> > >> goes into such an infinite loop. I found this was caused by
> > >> commit 31286a8484a8 ("mm: hwpoison: disable memory error handling on 1GB hugepage")
> > > Hello Shixin,
> > >
> > > It's little unclear to me about what behavior you are expecting and
> > > what the infinite loop repeats, could you explain little more about them?
> > > (I briefly tried to reproduce it, but didn't make it...)
> > >
> > >> It looks like there is a bug for hwpoison 1GB hugepage so I try to reproduce
> > >> the bug described. After trying to revert the patch in an earlier version of
> > >> the kernel, I reproduce the bug described. Then I try to revert the patch in
> > >> latest version, and find the bug is no longer reproduced.
> > >>
> > >> I compare the code paths of 1 GB hugepage and 2 MB hugepage for second madvise(MADV_HWPOISON),
> > >> and find that the problem is caused because in gup_pud_range(), pud_none() and
> > >> pud_huge() both return false and then trigger the bug. But in gup_pmd_range(),
> > >> the pmd_none() is modified to pmd_present() which will make code return directly.
> > >> The I find that it is commit 15494520b776 ("mm: fix gup_pud_range") which
> > >> cause latest version not reproduced. I backport commit 15494520b776 in
> > >> earlier version and find the bug is no longer reproduced either.
> > > Thank you for the analysis.
> > > So this patch might make 31286a8484a8 unnecessary, that's a good news.
> > >
> > >> So I'd like to consult that is it the time to revert commit 31286a8484a8?
> > >> Or if we modify pud_huge to be similar with pmd_huge, is it sufficient?
> > >>
> > >> I also noticed there is a TODO comment in memory_failure_hugetlb():
> > >> - conversion of a pud that maps an error hugetlb into hwpoison
> > >> entry properly works, and
> > >> - other mm code walking over page table is aware of pud-aligned
> > >> hwpoison entries.
> > > These are simply minimum requirements, but might not be sufficient.
> > > We need testing (with removing 31286a8484a8) to make sure that
> > > there's no issues on some corner cases.
> > > (I start to extend existing hugetlb-related testcases to 1GB ones.)
> > >
> > > Thanks,
> > > Naoya Horiguchi
> > >
> > >> I'm not sure whether the above fix are sufficient, so is there anything else need
> > >> to analysis that I haven't considered?