Re: [PATCH] mm/rmap: use huge_ptep_get() in try_to_unmap_one()

From: Dev Jain

Date: Thu Jun 25 2026 - 04:04:07 EST




On 25/06/26 1:26 pm, David Hildenbrand (Arm) wrote:
> On 6/25/26 06:28, Dev Jain wrote:
>> try_to_unmap_one() handles hugetlb folios when memory failure needs
>> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
>> case page_vma_mapped_walk() returns the hugetlb entry in pvmw.pte, but
>> the code reads it with ptep_get() before decoding the PFN.
>>
>> That is wrong on architectures where hugetlb entries are not encoded as
>> regular PTEs. On s390, for example, a raw huge RSTE must be converted
>> by huge_ptep_get() before helpers such as pte_pfn() can inspect it. A
>> raw decode can select the wrong subpage, so try_to_unmap_one() can
>> install a hwpoison entry for the wrong PFN.
>>
>> The userspace-visible result is that a later access to the poisoned
>> hugetlb subpage can miss the expected SIGBUS. With DEBUG_VM, the wrong
>> subpage can also trip the PageHWPoison check.
>>
>> Use huge_ptep_get() for hugetlb mappings before decoding the PFN.
>>
>> Before c7ab0d2fdc84, the bug existed in the form of a plain dereference:
>> we would check the head page pfn of the hugetlb with pte_pfn(*pte), and
>> bail out on mismatch. This would mean that the hwpoisoned entry will not
>> get installed.
>>
>> I am not sure what the procedure is for such very old bugs - how far
>> back should I really go?
>>
>> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
>> Cc: stable@xxxxxxxxxxxxxxx
>> Signed-off-by: Dev Jain <dev.jain@xxxxxxx>
>> ---
>> Applies on mm-unstable (d17fe8a046a2).
>> There are similar old bugs present, in try_to_migrate_one(), check_pte(),
>> remove_migration_pte(), prot_none_hugetlb_entry().
>
> Yeah, we should handle all these cases properly. Can you send fixes?
>
> Using ptep_get() on something that's not a PTE entry is shaky on some architectures.

I can send the fixes, each blaming the oldest commit to which a backport is still
relatively simple. The bug will remain in kernels before that, where the code does not
even use ptep_get() and just does a plain dereference, if that is fine. Probably no one
is running pre-2017 kernels.
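
For anyone following along, here is a toy userspace model of the pattern being
fixed. It is not kernel code: the toy_* names, the two shift values and the
entry layout are made up for illustration and do not reflect the s390 encoding
or the real helper signatures. It only shows why a raw read followed by a
PFN decode can yield the wrong PFN when the huge entry uses a different layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: hypothetical layouts, not any real architecture. */
typedef struct { uint64_t val; } pte_t;

#define TOY_PFN_SHIFT       12  /* assumed layout of a regular PTE */
#define TOY_HUGE_PFN_SHIFT  20  /* assumed layout of a raw huge entry */

/* Decode the PFN, assuming the entry is in the regular PTE layout. */
static uint64_t toy_pte_pfn(pte_t pte)
{
	return pte.val >> TOY_PFN_SHIFT;
}

/* Raw read: returns the entry as stored, with no conversion. */
static pte_t toy_ptep_get(const pte_t *ptep)
{
	return *ptep;
}

/* Huge read: converts the raw huge layout into the regular PTE layout. */
static pte_t toy_huge_ptep_get(const pte_t *ptep)
{
	uint64_t pfn = ptep->val >> TOY_HUGE_PFN_SHIFT;
	pte_t pte = { pfn << TOY_PFN_SHIFT };

	return pte;
}

/* Pick the accessor based on the mapping type before decoding the PFN. */
static uint64_t toy_mapped_pfn(const pte_t *ptep, bool is_hugetlb)
{
	pte_t pte = is_hugetlb ? toy_huge_ptep_get(ptep) : toy_ptep_get(ptep);

	return toy_pte_pfn(pte);
}
```

In this model, decoding a raw huge entry through toy_ptep_get() gives a PFN that
does not match the one stored in it, while toy_mapped_pfn() with is_hugetlb set
recovers it. The other call sites listed above would need the same kind of
selection, each adapted to its own context.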
