Re: [RESEND PATCH v7 00/10] Small-sized THP for anonymous memory
From: Ryan Roberts
Date: Mon Nov 27 2023 - 05:31:11 EST
On 27/11/2023 08:20, Alistair Popple wrote:
>
> David Hildenbrand <david@xxxxxxxxxx> writes:
>
>> On 24.11.23 16:53, Matthew Wilcox wrote:
>>> On Fri, Nov 24, 2023 at 04:25:38PM +0100, David Hildenbrand wrote:
>>>> On 24.11.23 16:13, Matthew Wilcox wrote:
>>>>> On Fri, Nov 24, 2023 at 09:56:37AM +0000, Ryan Roberts wrote:
>>>>>> On 23/11/2023 15:59, Matthew Wilcox wrote:
>>>>>>> On Wed, Nov 22, 2023 at 04:29:40PM +0000, Ryan Roberts wrote:
>>>>>>>> This is v7 of a series to implement small-sized THP for anonymous memory
>>>>>>>> (previously called "large anonymous folios"). The objective of this is to
>>>>>>>
>>>>>>> I'm still against small-sized THP. We've now got people asking whether
>>>>>>> the THP counters should be updated when dealing with large folios that
>>>>>>> are smaller than PMD sized. It's sowing confusion, and we should go
>>>>>>> back to large anon folios as a name.
>>>>>>
>>>>>> I suspect I'm labouring the point here, but I'd like to drill into exactly what
>>>>>> you are objecting to. Is it:
>>>>>>
>>>>>> A) Using the name "small-sized THP" (which is currently only used in the commit
>>>>>> logs and a couple of times in the documentation).
>>>>>
>>>>> Yes, this is what I'm objecting to.
>>>>
>>>> I'll just repeat that "large anon folio" is misleading, because
>>>> * we already have "large anon folios" in hugetlb
>>> We do? Where?
>>
>> MAP_PRIVATE of hugetlb. hugepage_add_anon_rmap() instantiates them.
>>
>> Hugetlb is likely one of the oldest users of compound pages, aka large folios.
>
> I don't like "large anon folios" because it seems to confuse colleagues
> when explaining that large anon folios are actually smaller than the
> existing Hugetlb/THP size. I suspect this is because they already assume
> large folios are used for THP. I guess this wouldn't be an issue if
> everyone assumed THP was implemented with huge folios, but that doesn't
> seem to be the case for me at least. Likely because the default THP size
> is often 2MB, which is hardly huge.
>
>>>
>>>> * we already have PMD-sized "large anon folios" in THP
>>> Right, those are already accounted as THP, and that's what users
>>> expect.
>>> If we're allocating 1024 x 64kB chunks of memory, the user won't be able
>>> to distinguish that from 32 x 2MB chunks of memory, and yet the
>>> performance profile for some applications will be very different.
>>
>> Very right, and because there will be a difference between 1024 x
>> 64kB, 2048 x 32 kB and so forth, we need new memory stats either way.
>>
>> Ryan had some ideas on that, but currently, that's considered future
>> work, just like it likely is for the pagecache as well and needs much
>> more thoughts.
>>
>> Initially, the admin will have to enable all that for anon either
>> way. It all boils down to one memory statistic for anon memory
>> (AnonHugePages) that's messed-up already.
>>
>>>
>>>> But in the end, I don't care what we call this in a commit message.
>>>>
>>>> Just sticking to what we have right now makes most sense to me.
>>>>
>>>> I know, as the creator of the term "folio" you have to object :P Sorry ;)
>>> I don't care if it's called something to do with folios or not. I
>>
>> Good!
>>
>>> am objecting to the use of the term "small THP" on the grounds of
>>> confusion and linguistic nonsense.
>>
>> Maybe that's the reason why FreeBSD calls them "medium-sized
>> superpages", because "Medium-sized" seems to be more appropriate to
>> express something "in between".
>
> Transparent Medium Pages?
I don't think this is future proof; if we are going to invent a new term, it
needs to be independent of size to include all sizes including PMD-size and
perhaps in future, bigger-than-PMD-size. I think generalizing the meaning of
"huge" in THP to mean "bigger than the base page" is the best way to do this.
Then as David says, over time people will qualify it with a specific size when
appropriate.
>
>> So far I thought the reason was because they focused on 64k only.
>>
>> Never trust a German guy on naming suggestions. John has so far been
>> my naming expert, so I'm hoping he can help.
>
> Likewise :-)
>
>> "Sub-pmd-sized THP" is just a mouthful. But then again, this would
>> just be a temporary name, and in the future THP will just naturally
>> come in multiple sizes (and others here seem to agree on that).
I actually don't mind "sub-pmd-sized THP" given the few locations it's actually
going to live.
>>
>>
>> But just to repeat: I don't think there is need to come up with new
>> terminology and that there will be mass-confusion. So far I've not
>> heard a compelling argument besides "one memory counter could confuse
>> an admin that explicitly enables that new behavior.".
>>
>> Side note: I'm happy that we've reached a stage where we're
>> nitpicking on names :)
>
Agreed. We are bikeshedding here. But if we really can't swallow "small-sized
THP" then perhaps the most efficient way to move this forwards is to review the
documentation (where "small-sized THP" appears twice in order to differentiate
from PMD-sized THP) - it's in patch 3. Perhaps it will be easier to come up with
a good description in the context of that prose? Then once we have that,
hopefully a term will fall out that I can use to update the commit logs.