Re: [PATCH v2] mm/hugetlb: fix a addressing exception caused by huge_pte_offset()

From: Mike Kravetz
Date: Tue Mar 24 2020 - 15:47:28 EST


On 3/24/20 10:59 AM, Jason Gunthorpe wrote:
> On Tue, Mar 24, 2020 at 09:19:29AM -0700, Mike Kravetz wrote:
>> On 3/24/20 8:55 AM, Jason Gunthorpe wrote:
>>> On Tue, Mar 24, 2020 at 08:25:09AM -0700, Mike Kravetz wrote:
>>>> On 3/24/20 4:55 AM, Jason Gunthorpe wrote:
>>>>> Also, since CH moved all the get_user_pages_fast code out of the
>>>>> arch code, many/all archs can drop their arch-specific version of
>>>>> this routine. This is really just a specialized version of
>>>>> gup_fast's algorithm.
>>>>>
>>>>> (also the arch versions seem different, why do some return actual
>>>>> ptes, not null?)
>>>>
>>>> Not sure I understand that last question. The return value should be
>>>> a *pte or null.
>>>
>>> I mean the common code ends like this:
>>>
>>>         pmd = pmd_offset(pud, addr);
>>>         if (sz != PMD_SIZE && pmd_none(*pmd))
>>>                 return NULL;
>>>         /* hugepage or swap? */
>>>         if (pmd_huge(*pmd) || !pmd_present(*pmd))
>>>                 return (pte_t *)pmd;
>>>
>>>         return NULL;
>>>
>>> So it always returns a pointer into a PUD or PMD, while say, ppc
>>> in __find_linux_pte() ends like:
>>>
>>>         return pte_offset_kernel(&pmd, ea);
>>>
>>> Which is pointing to a PTE
>>
>> Ok, now I understand the question. huge_pte_offset will/should only be
>> called for addresses that are in a vma backed by hugetlb pages. So,
>> pte_offset_kernel() will only return a pointer to the page table type
>> (PUD/PMD/etc.) associated with a huge page size supported by the
>> particular arch.
>
> I thought pte_offset_kernel always returns PTEs (i.e. the 4k entries on
> x86). I suppose what you are saying is that since the caller knows
> this is always a PUD or PMD due to the VMA, the pte_offset is dead code.

Yes, for x86 the returned pointer will refer to a PUD or PMD entry, or be
NULL. For huge page mappings/vmas on x86, there are no corresponding PTEs.
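
To illustrate the point, here is a minimal user-space sketch of the walk
discussed above. It is a toy model and not kernel code: every toy_* name,
the two-level layout, and the 2M/1G sizes are assumptions made up for the
example, and swap/migration entries are left out. It only mirrors the shape
of the common-code tail quoted earlier: the lookup stops at a PUD or PMD
entry, and never descends to the 4k PTE level.

```c
#include <stddef.h>

/* Toy model only: these names are hypothetical, not kernel definitions. */
#define TOY_PMD_SHIFT 21                      /* 2M, as on x86-64 */
#define TOY_PUD_SHIFT 30                      /* 1G, as on x86-64 */
#define TOY_PTRS      512
#define TOY_PMD_SIZE  (1UL << TOY_PMD_SHIFT)
#define TOY_PUD_SIZE  (1UL << TOY_PUD_SHIFT)

enum toy_kind { TOY_NONE = 0, TOY_HUGE, TOY_TABLE };

struct toy_entry {
	enum toy_kind kind;
	struct toy_entry *table;   /* next level, valid when kind == TOY_TABLE */
};

/*
 * Simplified analogue of the generic huge_pte_offset() tail.
 * "kind != TOY_TABLE" stands in for "huge, or not present".
 */
struct toy_entry *toy_huge_offset(struct toy_entry *pud_table,
				  unsigned long addr, unsigned long sz)
{
	struct toy_entry *pud, *pmd;

	pud = &pud_table[(addr >> TOY_PUD_SHIFT) % TOY_PTRS];
	if (sz != TOY_PUD_SIZE && pud->kind == TOY_NONE)
		return NULL;
	if (pud->kind != TOY_TABLE)
		return pud;             /* pointer into the PUD level */

	pmd = &pud->table[(addr >> TOY_PMD_SHIFT) % TOY_PTRS];
	if (sz != TOY_PMD_SIZE && pmd->kind == TOY_NONE)
		return NULL;
	if (pmd->kind != TOY_TABLE)
		return pmd;             /* pointer into the PMD level */

	/* PMD refers to a table of 4k PTEs: not a huge mapping. */
	return NULL;
}

/* Returns 0 if the three lookups behave as described, else a case number. */
int toy_selftest(void)
{
	static struct toy_entry pud_table[TOY_PTRS];
	static struct toy_entry pmd_table[TOY_PTRS];

	pud_table[0].kind = TOY_TABLE;
	pud_table[0].table = pmd_table;
	pud_table[1].kind = TOY_HUGE;    /* 1G mapping */
	pmd_table[1].kind = TOY_HUGE;    /* 2M mapping */
	pmd_table[2].kind = TOY_TABLE;   /* would lead to 4k PTEs */

	if (toy_huge_offset(pud_table, TOY_PMD_SIZE, TOY_PMD_SIZE) != &pmd_table[1])
		return 1;
	if (toy_huge_offset(pud_table, TOY_PUD_SIZE, TOY_PUD_SIZE) != &pud_table[1])
		return 2;
	if (toy_huge_offset(pud_table, 2 * TOY_PMD_SIZE, TOY_PMD_SIZE) != NULL)
		return 3;
	return 0;
}
```

Calling toy_selftest() from a small main() exercises the three cases: a 2M
lookup that ends at a PMD entry, a 1G lookup that ends at a PUD entry, and
an address whose PMD points to a PTE table, for which the lookup returns
NULL.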
--
Mike Kravetz