Re: [PATCH] proc: mm: export PTE sizes directly in smaps (v2)

From: Dave Hansen
Date: Mon Nov 28 2016 - 16:41:08 EST


... cc'ing the arm64 maintainers

On 11/28/2016 01:07 PM, Vlastimil Babka wrote:
> On 11/28/2016 05:52 PM, Dave Hansen wrote:
>> On 11/24/2016 06:22 AM, Vlastimil Babka wrote:
>>> On 11/17/2016 01:28 AM, Dave Hansen wrote:
>>>> @@ -702,11 +707,13 @@ static int smaps_hugetlb_range(pte_t *pt
>>>>  	}
>>>>  	if (page) {
>>>>  		int mapcount = page_mapcount(page);
>>>> +		unsigned long hpage_size = huge_page_size(hstate_vma(vma));
>>>>
>>>> +		mss->rss_pud += hpage_size;
>>>
>>> This hardcoded pud doesn't look right, doesn't the pmd/pud depend on
>>> hpage_size?
>>
>> Urg, nope. Thanks for noticing that! I think we'll need something
>> along the lines of:
>>
>> 	if (hpage_size == PUD_SIZE)
>> 		mss->rss_pud += PUD_SIZE;
>> 	else if (hpage_size == PMD_SIZE)
>> 		mss->rss_pmd += PMD_SIZE;
>
> Sounds better, although I wonder whether there are some weird arches
> supporting hugepage sizes that don't match page table levels. I recall
> that e.g. MIPS could do arbitrary size, but dunno if the kernel supports
> that...

arm64 seems to have pretty arbitrary sizes, and seems to be able to
build them out of multiple hardware PTE sizes. I think I can fix my
code to handle those:

	if (hpage_size >= PGD_SIZE)
		mss->rss_pgd += PGD_SIZE;
	else if (hpage_size >= PUD_SIZE)
		mss->rss_pud += PUD_SIZE;
	else if (hpage_size >= PMD_SIZE)
		mss->rss_pmd += PMD_SIZE;
	else
		mss->rss_pte += PAGE_SIZE;
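
To make the bucketing concrete, here is a rough userspace sketch of the same
comparison chain. It is not kernel code: the EX_* sizes assume a 4K granule
with four page table levels, and struct ex_stats and ex_account_entry() are
made-up names standing in for the rss_* fields and the accounting step. It
also assumes the accounting runs once per page table entry, so an
intermediate size gets credited one entry's worth at a time.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for a 4K-granule, 4-level layout (illustrative only). */
#define EX_PAGE_SIZE (UINT64_C(1) << 12)	/* 4 KiB */
#define EX_PMD_SIZE  (UINT64_C(1) << 21)	/* 2 MiB */
#define EX_PUD_SIZE  (UINT64_C(1) << 30)	/* 1 GiB */
#define EX_PGD_SIZE  (UINT64_C(1) << 39)	/* 512 GiB */

/* Hypothetical stand-in for the rss_* counters discussed in this thread. */
struct ex_stats {
	uint64_t rss_pte;
	uint64_t rss_pmd;
	uint64_t rss_pud;
	uint64_t rss_pgd;
};

/*
 * Account a single page table entry that maps (part of) a huge page of
 * hpage_size bytes. The entry is credited to the largest level whose
 * size does not exceed hpage_size, and only that level's size is added.
 */
void ex_account_entry(struct ex_stats *s, uint64_t hpage_size)
{
	assert(hpage_size >= EX_PAGE_SIZE);

	if (hpage_size >= EX_PGD_SIZE)
		s->rss_pgd += EX_PGD_SIZE;
	else if (hpage_size >= EX_PUD_SIZE)
		s->rss_pud += EX_PUD_SIZE;
	else if (hpage_size >= EX_PMD_SIZE)
		s->rss_pmd += EX_PMD_SIZE;
	else
		s->rss_pte += EX_PAGE_SIZE;
}
```

Under those assumptions, a 64K page built from contiguous PTEs lands in
rss_pte one 4K entry at a time, and a 32M page built from contiguous PMDs
lands in rss_pmd one 2M entry at a time.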

But, I *think* that means that smaps_hugetlb_range() is *currently*
broken for these intermediate arm64 sizes. The code does:

	if (mapcount >= 2)
		mss->shared_hugetlb += hpage_size;
	else
		mss->private_hugetlb += hpage_size;

So I *think* we may count a hugetlbfs arm64 CONT_PTES page multiple
times, and account hpage_size for *each* of the CONT_PTES. That would
artificially inflate the smaps output for those pages.
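
A small userspace model of the suspected inflation, purely as a sketch. It
assumes 4K base pages and 16 entries per contiguous range, and it assumes the
accounting really is invoked once per entry, which is exactly the part that
needs confirming. The ex_* names are invented for this example and do not
exist in the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SIZE (UINT64_C(1) << 12)	/* assumed 4 KiB base page */
#define EX_CONT_PTES 16				/* assumed entries per contiguous range */

/* Hypothetical stand-in for the hugetlb counters in the smaps stats. */
struct ex_hugetlb_stats {
	uint64_t shared_hugetlb;
	uint64_t private_hugetlb;
};

/* Mirrors the per-call accounting quoted above: add hpage_size every time. */
void ex_account_callback(struct ex_hugetlb_stats *s, int mapcount,
			 uint64_t hpage_size)
{
	if (mapcount >= 2)
		s->shared_hugetlb += hpage_size;
	else
		s->private_hugetlb += hpage_size;
}

/*
 * Model a walk over one privately mapped huge page that triggers
 * nr_entries accounting calls, and return the reported private total.
 */
uint64_t ex_walk_one_hugepage(unsigned int nr_entries, uint64_t hpage_size)
{
	struct ex_hugetlb_stats s = {0, 0};
	unsigned int i;

	assert(nr_entries > 0);
	for (i = 0; i < nr_entries; i++)
		ex_account_callback(&s, 1, hpage_size);

	return s.private_hugetlb;
}
```

With one call per huge page, a 64K page is reported as 64K. With one call
per entry, the same page would be reported as 16 * 64K = 1M, a 16x
overcount.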

Will / Catalin, is there something I'm missing?