Re: [RFC PATCH 1/8] mm: Provide pagesize to pmd_populate()

From: Christophe Leroy
Date: Sun May 26 2024 - 05:29:19 EST




On 25/03/2024 at 17:19, Jason Gunthorpe wrote:
> On Mon, Mar 25, 2024 at 03:55:54PM +0100, Christophe Leroy wrote:
>> Unlike many architectures, powerpc 8xx hardware tablewalk requires
>> a two-level process for all page sizes, although the second level only
>> has one entry when pagesize is 8M.
>>
>> To fit with Linux page table topology and without requiring special
>> page directory layout like hugepd, the page entry will be replicated
>> 1024 times in the standard page table. However for large pages it is
>> necessary to set bits in the level-1 (PMD) entry. At the time being,
>> for 512k pages the flag is kept in the PTE and inserted in the PMD
>> entry at TLB miss exception, that is necessary because we can have
>> pages of different sizes in a page table. However the 12 PTE bits are
>> fully used and there is no room for an additional bit for page size.
>>
>> For 8M pages, there will be only one page per PMD entry, it is
>> therefore possible to flag the pagesize in the PMD entry, with the
>> advantage that the information will already be at the right place for
>> the hardware.
>>
>> To do so, add a new helper called pmd_populate_size() which takes the
>> page size as an additional argument, and modify __pte_alloc() to also
>> take that argument. pte_alloc() is left unmodified in order to
>> reduce churn on callers, and a pte_alloc_size() is added for use by
>> pte_alloc_huge().
>>
>> When an architecture doesn't provide pmd_populate_size(),
>> pmd_populate() is used as a fallback.
>
> I think it would be a good idea to document what the semantic is
> supposed to be for sz?
>
> Just a general remark, probably nothing for this, but with these new
> arguments the historical naming seems pretty tortured for
> pte_alloc_size().. Something like pmd_populate_leaf(size) as a naming
> scheme would make this more intuitive. Ie pmd_populate_leaf() gives
> you a PMD entry where the entry points to a leaf page table able to
> store folios of at least size.

I removed patches 1 and 2. The _PMD_PAGE_8M bit is now added to the PMD
entry afterwards, in set_huge_pte_at().
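
For readers following along, here is a rough userspace model of that
ordering: the PMD entry is first populated without any knowledge of the
page size, and the 8M flag is ORed in later, when the huge PTE is set.
This is only a sketch, not the kernel code. Every sketch_* name, the flag
value and the types are hypothetical, and it assumes the low bits of the
table address are free because the table is aligned.

```c
#include <stdint.h>

/* Hypothetical values for illustration; NOT the real 8xx definitions. */
#define SKETCH_PMD_PAGE_8M 0x1u        /* assumed "8M page" flag in a level-1 entry */
#define SKETCH_SZ_8M       (8ul << 20) /* 8 MiB in bytes */

/* Simplified stand-in for a level-1 (PMD) entry. */
typedef struct {
	uint32_t val;
} sketch_pmd_t;

/*
 * Model of a size-agnostic populate step: the entry only records the
 * (aligned) address of the page table and knows nothing about page size.
 */
static void sketch_pmd_populate(sketch_pmd_t *pmd, uint32_t pte_table_addr)
{
	pmd->val = pte_table_addr;
}

/*
 * Model of the later step: once the huge PTE is being set and the size is
 * known, flag the PMD entry if (and only if) the page is 8M. Other sizes
 * leave the PMD entry untouched in this sketch.
 */
static void sketch_flag_pmd_for_size(sketch_pmd_t *pmd, unsigned long sz)
{
	if (sz == SKETCH_SZ_8M)
		pmd->val |= SKETCH_PMD_PAGE_8M;
}
```

With this split, the generic populate path needs no extra size argument.
Only the huge-page path, which already knows the size, touches the flag.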

>
> Anyhow, I thought the edits to the mm helpers were fine, certainly
> much nicer than hugepd. Do you see a path to remove hugepd entirely
> from here?
>
> Thanks,
> Jason