Re: [PATCH 1/1] mm: prevent poison consumption when splitting THP

From: Lance Yang

Date: Tue Sep 30 2025 - 06:20:48 EST

On 2025/9/30 18:13, Lance Yang wrote:


On 2025/9/30 16:53, David Hildenbrand wrote:
On 30.09.25 03:48, Lance Yang wrote:
On Tue, Sep 30, 2025 at 3:07 AM David Hildenbrand <david@xxxxxxxxxx> wrote:

On 29.09.25 18:30, Zhuo, Qiuxu wrote:
Hi Tony,

From: Luck, Tony <tony.luck@xxxxxxxxx>
[...]
Subject: RE: [PATCH 1/1] mm: prevent poison consumption when splitting THP

Miaohe mentioned in another e-mail that there was an HWPoisoned flag
for the raw error 4K page.
We could use that flag just to skip that raw error page and still use
the zeropage for other healthy sub-pages. I'll try that.

That HWPoisoned flag is only set for raw pages where an error has been
detected. Maybe Linux could implement an
"is_this_page_all_zero_mc_safe()"[1] that would catch undetected poison

This sounds like a great suggestion to me.
Let's see what others think about this and the name (though the name already LGTM 😊).

The function name is just ... special. Not the good type of special IMHO. :)

Note that we'll be moving to pages_identical() in [1]. Maybe we would
want a pages_identical_mc() or sth. like that as a follow up later.


So in any case, make that a follow-up work on top of a simple fix.

Yeah. IIRC, as David suggested earlier, we can just check if a page is
poisoned using PageHWPoison().

Perhaps we should move this check into pages_identical()? This would make
it a central place to determine if pages are safe to access and merge ;)

It would have to go into memcmp_pages(). Would be an option, but not sure if we should rather let callers deal with that.

For example, in some cases it might be sufficient to just check if the large folio has any poisoned page and give up early.

FWIW, one idea I had was to create a unified pre-flight checker, like
folio_pages_identical_prepare(struct folio *folio). A caller could use
it before a loop of pages_identical() calls to pre-check a folio :)

Forgot to add:

It would centralize all folio-level checks.

So if we ever need a new check in the future, we'd only modify the
prepare helper, not all the individual callers.
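
To make the idea a bit more concrete, here is a rough userspace model of
how such a pre-flight helper could sit in front of a loop of
pages_identical() calls. It is only a sketch: struct mock_page, struct
mock_folio and every mock_*() function are made-up stand-ins, not the
kernel's struct page / struct folio or any existing API. The hwpoison
field merely models what PageHWPoison() reports, and the tiny page size
is chosen for the model only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MOCK_PAGE_SIZE 64	/* tiny "page" for the model; real base pages are larger */

/* Mock stand-ins for illustration; NOT the kernel's struct page / struct folio. */
struct mock_page {
	bool hwpoison;				/* models what PageHWPoison() would report */
	unsigned char data[MOCK_PAGE_SIZE];
};

struct mock_folio {
	size_t nr_pages;
	struct mock_page *pages;
};

/*
 * Hypothetical pre-flight check: returns false if any subpage is marked
 * poisoned, without ever reading page contents.
 */
static bool mock_folio_pages_identical_prepare(const struct mock_folio *folio)
{
	for (size_t i = 0; i < folio->nr_pages; i++)
		if (folio->pages[i].hwpoison)
			return false;	/* give up early */
	return true;
}

/* Models pages_identical(): plain byte compare, assumes both pages are safe to read. */
static bool mock_pages_identical(const struct mock_page *a,
				 const struct mock_page *b)
{
	return memcmp(a->data, b->data, MOCK_PAGE_SIZE) == 0;
}

/*
 * Example caller: count subpages that match an all-zero page.
 * Returns -1 if the folio fails the pre-flight check.
 */
static long mock_count_zero_subpages(const struct mock_folio *folio)
{
	static const struct mock_page zero_page;	/* zero-initialized */
	long n = 0;

	assert(folio && folio->pages);

	if (!mock_folio_pages_identical_prepare(folio))
		return -1;

	for (size_t i = 0; i < folio->nr_pages; i++)
		if (mock_pages_identical(&folio->pages[i], &zero_page))
			n++;
	return n;
}
```

In this model a single poisoned subpage makes the caller bail out for
the whole folio, which matches the "give up early" case mentioned
above. Skipping only the raw error page while still handling the
healthy subpages would need a per-page check inside the loop instead.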