Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb

From: David Hildenbrand (Arm)

Date: Mon Jun 29 2026 - 03:32:36 EST


On 6/29/26 08:48, Dev Jain wrote:
>
>
> On 29/06/26 12:09 pm, David Hildenbrand (Arm) wrote:
>> On 6/28/26 07:44, Lance Yang wrote:
>>>
>>> [...]
>>>
>>> Yes, that's what I had in mind :) thanks!
>>>
>>>
>>> Maybe worth spelling out the rule as well:
>>>
>>> For arch helpers that use addr, huge_ptep_get() assumes addr is the
>>> address for the hugetlb entry ptep points to. arm64 already makes that
>>> assumption.
>>>
>>> Callers where addr may not be hugepage-aligned should use
>>> hugetlb_ptep_get() instead.
>>
>> Do we have any examples where code would do that? I would think that all code
>> must properly align addr ahead of times.
>
> Sashiko notes other places:
>
> https://sashiko.dev/#/patchset/20260625112955.3254283-1-dev.jain%40arm.com

Yeah, that looks shaky. We do seem to have a bunch of these cases, primarily
from pagewalk code (where some users like pagemap need the actual address).

I think we have two options:

1) To prevent any (further) issues, make huge_ptep_get() always consume the
hstate, and let the arch code deal with aligning addr. Invasive.

2) Make the arch code handle the alignment without the hstate.

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 30772a909aea3..303a1b74796c9 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -126,6 +126,9 @@ pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 		return orig_pte;

 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
+	ptep = PTR_ALIGN_DOWN(ptep, sizeof(*ptep) * ncontig);
+	orig_pte = __ptep_get(ptep);
+
 	for (i = 0; i < ncontig; i++, ptep++) {
 		pte_t pte = __ptep_get(ptep);

(nshift/order instead of ncontig might avoid a multiplication, but not sure if
that matters in practice)
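
To illustrate the pointer alignment in the diff above, here is a minimal
userspace sketch. It is not kernel code: pte_t is a plain 64-bit stand-in, and
cont_first_entry(), cont_first_entry_shift() and the two index helpers are
hypothetical names. It assumes ncontig is a power of two and that the table is
aligned to at least the size of one contiguous range, as a real page table
would be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t pte_t;	/* stand-in for the kernel's pte_t (assumption) */

/* Align ptep down to the first entry of its contiguous range. */
static pte_t *cont_first_entry(pte_t *ptep, size_t ncontig)
{
	uintptr_t span = sizeof(*ptep) * ncontig;	/* power of two */

	return (pte_t *)((uintptr_t)ptep & ~(span - 1));
}

/* Same result from an order (log2 of ncontig), without the multiplication. */
static pte_t *cont_first_entry_shift(pte_t *ptep, unsigned int order)
{
	unsigned int shift = order + 3;	/* log2(sizeof(pte_t)) == 3 */

	return (pte_t *)(((uintptr_t)ptep >> shift) << shift);
}

/* A page-aligned table with 512 entries, mimicking one page-table page. */
static _Alignas(4096) pte_t table[512];

/* Index of the first entry of the range that contains table[idx]. */
static size_t cont_first_index(size_t idx, size_t ncontig)
{
	return (size_t)(cont_first_entry(&table[idx], ncontig) - table);
}

static size_t cont_first_index_shift(size_t idx, unsigned int order)
{
	return (size_t)(cont_first_entry_shift(&table[idx], order) - table);
}
```

With ncontig == 16, any entry in 32..47 aligns down to entry 32, so the loop
always starts at the head of the range no matter which entry the caller passed.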

IIUC, that's similar to what huge_ptep_get() does on ppc.


static inline pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
{
	if (ptep_is_8m_pmdp(mm, addr, ptep))
		ptep = pte_offset_kernel((pmd_t *)ptep, ALIGN_DOWN(addr, SZ_8M));
	return ptep_get(ptep);
}
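
The ppc variant aligns the address rather than the pointer. A minimal userspace
sketch of just that step, assuming a power-of-two size; align_down() here is a
hypothetical stand-in for the kernel's ALIGN_DOWN(), and SZ_8M is redefined
locally:

```c
#include <assert.h>

#define SZ_8M	(8UL << 20)	/* local stand-in for the kernel constant */

/* Round addr down to a multiple of size; size must be a power of two. */
static inline unsigned long align_down(unsigned long addr, unsigned long size)
{
	return addr & ~(size - 1);
}
```

Every address inside the same 8M page then resolves to the same base, so the
lookup no longer depends on which address the caller passed.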

I'd assume we could do the same on riscv. Besides those, I don't think any
other arch has cont entries.


Interestingly, huge_pte_clear() / huge_ptep_get_and_clear() and friends would
all go wrong when an unaligned address is passed. But that code is really only
called from hugetlb.c, where we should take better care of that (e.g.,
partially zapping a hugetlb page is not possible).

--
Cheers,

David