Re: [PATCH v2] mm/hugetlb: fix a addressing exception caused by huge_pte_offset()

From: Jason Gunthorpe
Date: Tue Mar 24 2020 - 11:55:56 EST


On Tue, Mar 24, 2020 at 08:25:09AM -0700, Mike Kravetz wrote:
> On 3/24/20 4:55 AM, Jason Gunthorpe wrote:
> > Also, since CH moved all the get_user_pages_fast code out of the
> > arch's many/all archs can drop their arch specific version of this
> > routine. This is really just a specialized version of gup_fast's
> > algorithm..
> >
> > (also the arch versions seem different, why do some return actual
> > ptes, not null?)
>
> Not sure I understand that last question. The return value should be
> a *pte or null.

I mean the common code ends like this:

	pmd = pmd_offset(pud, addr);
	if (sz != PMD_SIZE && pmd_none(*pmd))
		return NULL;
	/* hugepage or swap? */
	if (pmd_huge(*pmd) || !pmd_present(*pmd))
		return (pte_t *)pmd;

	return NULL;

So it always returns a pointer into a PUD or PMD, while, say, ppc
in __find_linux_pte() ends like:

	return pte_offset_kernel(&pmd, ea);

Which is pointing to a PTE.

So does sparc:

	pmd = pmd_offset(pud, addr);
	if (pmd_none(*pmd))
		return NULL;
	if (is_hugetlb_pmd(*pmd))
		return (pte_t *)pmd;
	return pte_offset_map(pmd, addr);

Which is even worse because it is leaking a kmap: nothing ever calls
pte_unmap() on the pointer that pte_offset_map() returned.
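
To make the leak concrete, here is a toy userspace sketch of the
map/unmap pairing. Everything prefixed toy_ is a hypothetical stand-in,
not kernel code, and it assumes a configuration where pte_offset_map()
really does take a temporary mapping that pte_unmap() must drop. The
counter only models "outstanding mappings":

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace stand-ins; none of these are the kernel's definitions. */
typedef struct { unsigned long val; } pte_t;

static int toy_mapped;            /* outstanding temporary mappings */
static pte_t toy_pte_page[4];     /* pretend page of PTEs */

/* Models pte_offset_map(): returns a pointer and takes a mapping. */
static pte_t *toy_pte_offset_map(size_t idx)
{
	assert(idx < 4);
	toy_mapped++;
	return &toy_pte_page[idx];
}

/* Models pte_unmap(): drops the mapping taken above. */
static void toy_pte_unmap(pte_t *pte)
{
	(void)pte;
	toy_mapped--;
}

/* Balanced caller: map, read, unmap. */
static unsigned long toy_read_balanced(size_t idx)
{
	pte_t *pte = toy_pte_offset_map(idx);
	unsigned long val = pte->val;

	toy_pte_unmap(pte);
	return val;
}

/* Shape of the sparc tail: the mapped pointer escapes to the caller. */
static pte_t *toy_lookup_leaky(size_t idx)
{
	return toy_pte_offset_map(idx);
}
```

After toy_read_balanced() the counter is back at zero; after
toy_lookup_leaky() it stays at one unless the caller knows to unmap,
which a huge_pte_offset() caller has no reason to do.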

etc

> /*
>  * huge_pte_offset() - Walk the page table to resolve the hugepage
>  * entry at address @addr
>  *
>  * Return: Pointer to page table or swap entry (PUD or PMD) for
                                           ^^^^^^^^^^^^^^^^^^^

I.e. the above is not followed by the archs.

I'm also scratching my head that a function that returns a pte_t *
always returns a pointer to a PUD or PMD entry. Strange bit of type
casting.
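
A toy userspace sketch of what that cast does to a caller. The types,
the TOY_HUGE bit and the toy_ helpers are all hypothetical stand-ins,
not the kernel's definitions; the point is only that the returned
pte_t * aliases a PMD slot and the type no longer says which level it
came from:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for two page table levels; not the kernel's types. */
typedef struct { unsigned long val; } pte_t;
typedef struct { unsigned long val; } pmd_t;

#define TOY_HUGE 0x1UL	/* hypothetical "this PMD maps a huge page" bit */

static pmd_t toy_pmd_table[2];

/*
 * Same shape as the common code's tail: the result is typed pte_t *,
 * but when non-NULL it is really the address of a PMD slot.
 */
static pte_t *toy_huge_pte_offset(size_t idx)
{
	pmd_t *pmd;

	assert(idx < 2);
	pmd = &toy_pmd_table[idx];
	if (pmd->val & TOY_HUGE)
		return (pte_t *)pmd;
	return NULL;
}

/* The caller can only recover the level by comparing addresses. */
static int toy_is_pmd_slot(const pte_t *p, size_t idx)
{
	return (const void *)p == (const void *)&toy_pmd_table[idx];
}
```

With toy_pmd_table[0].val set to TOY_HUGE, toy_huge_pte_offset(0)
returns a pointer for which toy_is_pmd_slot(p, 0) is true, so the
"pte" is a PMD entry in everything but its type.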

Jason