Re: Re: [PATCH v2] mm: hugetlb: add hwcrp_hugepages to record memory failure on hugetlbfs

From: Bin Wang
Date: Sun Jul 11 2021 - 10:20:16 EST


Hi Naoya,

> > Yes, splitting the huge pages and isolating a base page is ideal. And
> > we do this with dissolve_free_huge_page() when page_mapping() returns
> > NULL. I think there is a reason (but I do not get it) why we don't split
> > huge pages in hugetlbfs_error_remove_page() or after. So I chose to
> > add a new count.
>
> Maybe the resource is the main reason for this incompleteness. I have noticed
> this for years and kept saying "this is on my todo list", but still haven't
> done it (really sorry about that...). Anyway, I hope you can solve your
> problem with the "/proc/kpageflags" approach, which is the recommended solution.

I do not understand exactly what you mean by "resource". I have tried calling
dissolve_free_huge_page() after hugetlbfs_error_remove_page() and it worked.
As I understand it, the error huge page has already been truncated from
hugetlbfs at that point, so it can no longer be accessed or allocated again.
I therefore think it is safe to split it.
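
For reference, this is how I understand the "/proc/kpageflags" approach from
user space. It is only a minimal sketch under my own assumptions: the helper
names (kpf_test, kpf_is_poisoned_hugetlb, kpf_read) are hypothetical, the bit
numbers are the ones listed in Documentation/admin-guide/mm/pagemap.rst, and
I assume the hwpoison flag is reported on the same entry that has KPF_HUGE set.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/* Bit numbers as listed in Documentation/admin-guide/mm/pagemap.rst. */
#define KPF_HUGE      17
#define KPF_HWPOISON  19

/* Return 1 if the given bit is set in a kpageflags entry, else 0. */
static int kpf_test(uint64_t flags, unsigned int bit)
{
	assert(bit < 64);
	return (int)((flags >> bit) & 1);
}

/* Return 1 if the entry describes a hugetlb page marked hwpoison. */
static int kpf_is_poisoned_hugetlb(uint64_t flags)
{
	return kpf_test(flags, KPF_HUGE) && kpf_test(flags, KPF_HWPOISON);
}

/*
 * Read the 64-bit flags entry for one PFN. The file is an array of
 * 8-byte entries indexed by PFN. Returns 0 on success, -1 on failure.
 */
static int kpf_read(const char *path, uint64_t pfn, uint64_t *flags)
{
	ssize_t n;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	n = pread(fd, flags, sizeof(*flags), (off_t)(pfn * sizeof(*flags)));
	close(fd);
	return n == (ssize_t)sizeof(*flags) ? 0 : -1;
}
```

A caller would pass "/proc/kpageflags" (readable by root only) as the path,
walk the PFNs of interest, and count the entries for which
kpf_is_poisoned_hugetlb() returns 1.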

I would appreciate it if you could point out what I have overlooked, and I
will try to address it.

Thanks,
Bin Wang