Re: [PATCH] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
From: Dev Jain
Date: Thu Jun 25 2026 - 01:06:24 EST

On 25/06/26 10:12 am, Andrew Morton wrote:
> On Thu, 25 Jun 2026 04:28:51 +0000 Dev Jain <dev.jain@xxxxxxx> wrote:
>
>> try_to_unmap_one() handles hugetlb folios when memory failure needs
>> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
>> case page_vma_mapped_walk() returns the hugetlb entry in pvmw.pte, but
>> the code reads it with ptep_get() before decoding the PFN.
>>
>> That is wrong on architectures where hugetlb entries are not encoded as
>> regular PTEs. On s390, for example, a raw huge RSTE must be converted
>> by huge_ptep_get() before helpers such as pte_pfn() can inspect it. A
>> raw decode can select the wrong subpage, so try_to_unmap_one() can
>> install a hwpoison entry for the wrong PFN.
>>
>> The userspace-visible result is that a later access to the poisoned
>> hugetlb subpage can miss the expected SIGBUS. With DEBUG_VM, the wrong
>> subpage can also trip the PageHWPoison check.
>>
>> Use huge_ptep_get() for hugetlb mappings before decoding the PFN.
>>
>> Before c7ab0d2fdc84, the bug existed in the form of a plain dereference:
>> we would check the head page pfn of the hugetlb with pte_pfn(*pte), and
>> bail out on mismatch. This would mean that the hwpoisoned entry will not
>> get installed.
>>
>> I am not sure what is the procedure on such kinds of very old bugs - how
>> back should I really go?
>
> I think 9 years is enough ;)
>
>> There are similar old bugs present, in try_to_migrate_one(), check_pte(),
>> remove_migration_pte(), prot_none_hugetlb_entry().
>
> Why now? Was there some more recent (s390?) change which exposed this?
I was refactoring the hugetlb bits in try_to_unmap_one(), and David caught
the bug in review (which reminds me to add a "Reported-by" tag to this
patch).

I guess if someone had run hugetlb-read-hwpoison.c on s390, this would have
been caught. It turns out that this selftest belongs to the "destructive
tests" category in run_vmtests.sh, so neither ./run_vmtests.sh nor
./run_vmtests.sh -a runs it. It has to be run with ./run_vmtests.sh -d, and
that option was broken until one month ago, see 3432cbb291aa. So essentially
no one has been running that test.
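
To illustrate the failure mode from the patch description, here is a rough
userspace sketch. It is a toy model only: the toy_* names, the bit layout
and the shift values are all made up for illustration and do not reflect
the real s390 RSTE format or the kernel's pte_t. The only point is that
decoding a raw huge entry as if it were a regular PTE yields the wrong PFN,
while converting it first gives the right one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: NOT the kernel's types or any real architecture's encoding. */
typedef struct { uint64_t val; } toy_pte_t;

#define TOY_PTE_SHIFT   12  /* hypothetical: regular entries hold pfn << 12 */
#define TOY_HUGE_SHIFT  20  /* hypothetical: raw huge entries hold pfn << 20 */

/* Build a raw huge entry as the (made-up) hardware format would store it. */
static toy_pte_t toy_mk_huge_raw(uint64_t pfn)
{
	toy_pte_t pte = { .val = pfn << TOY_HUGE_SHIFT };
	return pte;
}

/* Stand-in for ptep_get(): returns the stored value unchanged. */
static toy_pte_t toy_ptep_get(const toy_pte_t *ptep)
{
	return *ptep;
}

/* Stand-in for huge_ptep_get(): converts the raw huge entry to PTE layout. */
static toy_pte_t toy_huge_ptep_get(const toy_pte_t *ptep)
{
	uint64_t pfn = ptep->val >> TOY_HUGE_SHIFT;
	toy_pte_t pte = { .val = pfn << TOY_PTE_SHIFT };
	return pte;
}

/* Stand-in for pte_pfn(): only valid on entries in regular PTE layout. */
static uint64_t toy_pte_pfn(toy_pte_t pte)
{
	return pte.val >> TOY_PTE_SHIFT;
}

/* Shape of the fix: pick the accessor based on the mapping type. */
static toy_pte_t toy_read_entry(const toy_pte_t *ptep, bool is_hugetlb)
{
	return is_hugetlb ? toy_huge_ptep_get(ptep) : toy_ptep_get(ptep);
}
```

In this model, reading a raw huge entry through toy_ptep_get() and then
calling toy_pte_pfn() gives a PFN that does not match the one stored, which
is the kind of mismatch that would make the hwpoison entry point at the
wrong subpage.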
>