Re: [PATCH V9 1/4] perf/core: Add PERF_SAMPLE_DATA_PAGE_SIZE

From: Michael Ellerman
Date: Fri Oct 09 2020 - 09:28:51 EST


Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
> On Thu, Oct 01, 2020 at 06:57:46AM -0700, kan.liang@xxxxxxxxxxxxxxx wrote:
>> +/*
>> + * Return the MMU page size of a given virtual address
>> + */
>> +static u64 __perf_get_page_size(struct mm_struct *mm, unsigned long addr)
>> +{
>> + pgd_t *pgd;
>> + p4d_t *p4d;
>> + pud_t *pud;
>> + pmd_t *pmd;
>> + pte_t *pte;
>> +
>> + pgd = pgd_offset(mm, addr);
>> + if (pgd_none(*pgd))
>> + return 0;
>> +
>> + p4d = p4d_offset(pgd, addr);
>> + if (!p4d_present(*p4d))
>> + return 0;
>> +
>> + if (p4d_leaf(*p4d))
>> + return 1ULL << P4D_SHIFT;
>> +
>> + pud = pud_offset(p4d, addr);
>> + if (!pud_present(*pud))
>> + return 0;
>> +
>> + if (pud_leaf(*pud))
>> + return 1ULL << PUD_SHIFT;
>> +
>> + pmd = pmd_offset(pud, addr);
>> + if (!pmd_present(*pmd))
>> + return 0;
>> +
>> + if (pmd_leaf(*pmd))
>> + return 1ULL << PMD_SHIFT;
>> +
>> + pte = pte_offset_map(pmd, addr);
>> + if (!pte_present(*pte)) {
>> + pte_unmap(pte);
>> + return 0;
>> + }
>> +
>> + pte_unmap(pte);
>> + return PAGE_SIZE;
>> +}
>
> So this mostly works, but gets a number of hugetlb and arch specific
> things wrong.
>
> With the first 3 patches, this is only exposed to x86 and Power.
> Michael, does the above work for you?

It might work on our server CPUs (book3s) but in general no I don't
think it will work for all powerpc.

> Looking at:
>
> arch/powerpc/include/asm/book3s/64/hugetlb.h:check_and_get_huge_psize()
>
> You seem to limit yourself to page-table sizes, however if I then look
> at the same function in:
>
> arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h
> arch/powerpc/include/asm/nohash/hugetlb-book3e.h
>
> it doesn't seem to constrain itself so.

Yeah the embedded CPUs have more weird sizes.

I think we have all the logic in our __find_linux_pte().

> Patch 4 makes it all far worse by exposing it to pretty much everybody.
>
> Now, I think we can fix at least the user mappings with the below delta,
> but if archs are using non-page-table MMU sizes we'll need arch helpers.
>
> ARM64 is in that last boat.

I think we probably need it to be weak so we can provide our own
version.

cheers