Re: [PATCH v4 1/2] mm/memory-failure: fix missing ->mf_stats count in hugetlb poison
From: jane . chu
Date: Wed Jan 14 2026 - 13:16:26 EST
On 1/14/2026 7:37 AM, Chris Mason wrote:
On Tue, 13 Jan 2026 01:07:50 -0700 Jane Chu <jane.chu@xxxxxxxxxx> wrote:
When a newly poisoned subpage ends up in an already poisoned hugetlb
folio, 'num_poisoned_pages' is incremented, but the per node ->mf_stats
is not. Fix the inconsistency by designating action_result() to update
them both.
While at it, define __get_huge_page_for_hwpoison() return values in terms
of symbol names for better readibility. Also rename
folio_set_hugetlb_hwpoison() to hugetlb_update_hwpoison() since the
function does more than the conventional bit setting and the fact
three possible return values are expected.
Hi everyone,
I ran mm-new through the AI patch review prompts, and this commit was flagged.
The review below looks right to me:
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
[ ... ]
@@ -2029,22 +2037,29 @@ static int try_memory_failure_hugetlb(unsigned long pfn, int flags, int *hugetlb)
*hugetlb = 1;
retry:
res = get_huge_page_for_hwpoison(pfn, flags, &migratable_cleared);
- if (res == 2) { /* fallback to normal page handling */
+ switch (res) {
+ case -EINVAL: /* fallback to normal page handling */
*hugetlb = 0;
return 0;
- } else if (res == -EHWPOISON) {
- if (flags & MF_ACTION_REQUIRED) {
- folio = page_folio(p);
- res = kill_accessing_process(current, folio_pfn(folio), flags);
- }
- action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
- return res;
- } else if (res == -EBUSY) {
+ case -EBUSY:
if (!(flags & MF_NO_RETRY)) {
flags |= MF_NO_RETRY;
goto retry;
}
return action_result(pfn, MF_MSG_GET_HWPOISON, MF_IGNORED);
+ case MF_HUGETLB_FOLIO_PRE_POISONED:
+ case MF_HUGETLB_PAGE_PRE_POISON:
+ if (flags & MF_ACTION_REQUIRED) {
+ folio = page_folio(p);
+ res = kill_accessing_process(current, folio_pfn(folio), flags);
+ }
+ if (res == MF_HUGETLB_FOLIO_PRE_POISONED)
When MF_ACTION_REQUIRED is set, kill_accessing_process() overwrites res
before this check. kill_accessing_process() returns either 0 or
-EHWPOISON, so the check for MF_HUGETLB_FOLIO_PRE_POISONED (which is 3)
will always be false in that path.
This means action_result() will be called with MF_MSG_HUGE instead of
MF_MSG_ALREADY_POISONED. Looking at action_result():
if (type != MF_MSG_ALREADY_POISONED && type != MF_MSG_PFN_MAP) {
num_poisoned_pages_inc(pfn);
update_per_node_mf_stats(pfn, result);
}
Does this cause incorrect counter increments for already-poisoned pages
when MF_ACTION_REQUIRED is set? The original code called action_result()
unconditionally with MF_MSG_ALREADY_POISONED regardless of whether
kill_accessing_process() was called.
Should a separate variable preserve the original res value before the
call to kill_accessing_process()?
Good catch! William Roche has also pointed out the same issue to me in private. I will fix it in v5 soon.
thanks,
-jane
+ action_result(pfn, MF_MSG_ALREADY_POISONED, MF_FAILED);
+ else
+ action_result(pfn, MF_MSG_HUGE, MF_FAILED);
+ return res;