On Fri, Apr 26, 2024 at 11:53:01AM -0700, Sidhartha Kumar wrote:Besides, in a possible hugetlb demote scenario, it seems to me that we should retry
On 4/26/24 11:27 AM, Matthew Wilcox wrote:I think we should convert __get_hwpoison_page() to return either the folio
On Fri, Apr 26, 2024 at 10:57:31AM -0700, Sidhartha Kumar wrote:That would be unsafe, the safe way would be if we moved page_folio() after
On 4/26/24 10:34 AM, Matthew Wilcox wrote:See if you can find a hole in this chain of reasoning ...
On Fri, Apr 26, 2024 at 10:15:11AM -0700, Sidhartha Kumar wrote:If I moved down the folio = page_folio() line to after we verify
Use a folio in get_any_page() to save 5 calls to compound head andSo I didn't do this before because I wasn't convinced it was safe.
convert the last user of shake_page() to shake_folio(). This allows us
to remove the shake_page() definition.
We don't have a refcount on the folio, so the page might no longer
be part of this folio by the time we get the refcount on the folio.
I'd really like to see some argumentation for why this is safe.
__get_hwpoison_page() has returned 1, which indicates the reference count
was successfully incremented via foliO_try_get(), that means the folio
conversion would happen after we have a refcount. In the case we don't call
__get_hwpoison_page(), that means the MF_COUNT_INCREASED flag is set. This
means the page has existing users so that path would be safe as well. So I
think this is safe after moving page_folio() after __get_hwpoison_page().
memory_failure()
p = pfn_to_online_page(pfn);
res = try_memory_failure_hugetlb(pfn, flags, &hugetlb);
(not a hugetlb)
if (TestSetPageHWPoison(p)) {
(not already poisoned)
if (!(flags & MF_COUNT_INCREASED)) {
res = get_hwpoison_page(p, flags);
get_hwpoison_page()
ret = get_any_page(p, flags);
get_any_page()
folio = page_folio(page)
the call to __get_hw_poison() in get_any_page() and there would still be one
remaining user of shake_page() that we can't convert. A safe version of this
patch would result in a removal of one use of PageHuge() and two uses of
put_page(), would that be worth submitting?
get_any_page()
if(__get_hwpoison_page())
folio = page_folio() /* folio_try_get() returned 1, safe */
or an ERR_PTR or NULL. Also, I think we should delete the "cannot catch
tail" part and just loop in __get_hwpoison_page() until we do catch it.
See try_get_folio() in mm/gup.c for inspiration (although you can't use
it exactly because that code knows that the page is mapped into a page
table, so has a refcount).
But that's just an immediate assessment; you might find a reason that
doesn't work.