[PATCH v2 1/3] mm/memory-failure.c: fix race with changing page compound again

From: Miaohe Lin
Date: Sat Mar 12 2022 - 02:47:13 EST


There is a race window where we got the compound_head, the hugetlb page
could be freed to buddy, or even changed to another compound page just
before we try to get hwpoison page. Think about the below race window:
CPU 1 CPU 2
memory_failure_hugetlb
struct page *head = compound_head(p);
hugetlb page might be freed to
buddy, or even changed to another
compound page.

get_hwpoison_page -- page is not what we want now...

If this race happens, just bail out. Also MF_MSG_DIFFERENT_PAGE_SIZE is
introduced to record this event.

Signed-off-by: Miaohe Lin <linmiaohe@xxxxxxxxxx>
---
include/linux/mm.h | 1 +
include/ras/ras_event.h | 1 +
mm/memory-failure.c | 12 ++++++++++++
3 files changed, 14 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9bada4096ac..ef98cff2b253 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3253,6 +3253,7 @@ enum mf_action_page_type {
MF_MSG_BUDDY,
MF_MSG_DAX,
MF_MSG_UNSPLIT_THP,
+ MF_MSG_DIFFERENT_PAGE_SIZE,
MF_MSG_UNKNOWN,
};

diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
index d0337a41141c..1e694fd239b9 100644
--- a/include/ras/ras_event.h
+++ b/include/ras/ras_event.h
@@ -374,6 +374,7 @@ TRACE_EVENT(aer_event,
EM ( MF_MSG_BUDDY, "free buddy page" ) \
EM ( MF_MSG_DAX, "dax page" ) \
EM ( MF_MSG_UNSPLIT_THP, "unsplit thp" ) \
+ EM ( MF_MSG_DIFFERENT_PAGE_SIZE, "different page size" ) \
EMe ( MF_MSG_UNKNOWN, "unknown page" )

/*
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 5444a8ef4867..dabecd87ad3f 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -733,6 +733,7 @@ static const char * const action_page_types[] = {
[MF_MSG_BUDDY] = "free buddy page",
[MF_MSG_DAX] = "dax page",
[MF_MSG_UNSPLIT_THP] = "unsplit thp",
+ [MF_MSG_DIFFERENT_PAGE_SIZE] = "different page size",
[MF_MSG_UNKNOWN] = "unknown page",
};

@@ -1534,6 +1535,17 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
}

lock_page(head);
+
+ /**
+ * The page could have changed compound pages due to race window.
+ * If this happens just bail out.
+ */
+ if (!PageHuge(p) || compound_head(p) != head) {
+ action_result(pfn, MF_MSG_DIFFERENT_PAGE_SIZE, MF_IGNORED);
+ res = -EBUSY;
+ goto out;
+ }
+
page_flags = head->flags;

if (hwpoison_filter(p)) {
--
2.23.0