Re: [RFC PATCH 5/5] mm: shmem: add anonymous share mTHP counters
From: Ryan Roberts
Date: Tue Apr 23 2024 - 07:40:14 EST
On 23/04/2024 02:46, Baolin Wang wrote:
>
>
> On 2024/4/23 09:17, Barry Song wrote:
>> On Mon, Apr 22, 2024 at 3:03 PM Baolin Wang
>> <baolin.wang@xxxxxxxxxxxxxxxxx> wrote:
>>>
>>> Signed-off-by: Baolin Wang <baolin.wang@xxxxxxxxxxxxxxxxx>
>>> ---
>>> include/linux/huge_mm.h | 2 ++
>>> mm/huge_memory.c | 4 ++++
>>> mm/shmem.c | 5 ++++-
>>> 3 files changed, 10 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>> index 26b6fa98d8ac..67b9c1acad31 100644
>>> --- a/include/linux/huge_mm.h
>>> +++ b/include/linux/huge_mm.h
>>> @@ -270,6 +270,8 @@ enum mthp_stat_item {
>>> MTHP_STAT_ANON_SWPOUT,
>>> MTHP_STAT_ANON_SWPOUT_FALLBACK,
>>> MTHP_STAT_ANON_SWPIN_REFAULT,
>>> + MTHP_STAT_SHMEM_ANON_ALLOC,
>>> + MTHP_STAT_SHMEM_ANON_ALLOC_FALLBACK,
>>
>> not quite sure about this. for 2MB pmd-mapped THP shmem, we count them
>> as FILE_THP.
>> here we are counting as SHMEM_ANON. To me, SHMEM_ANON is more correct but
>> it doesn't align with pmd-mapped THP. David, Ryan, what do you think?
>
> Thanks for reviewing.
>
> IMO, I think both approaches are acceptable, which also reflects the dual nature
> of anonymous shared pages: on the one hand they are anonymous pages, and on the
> other hand, they are backed by a pseudo file. From the user's perspective, I
> prefer to use the term "anonymous shmem", which can be distinguished from the
> real file-backed THP.
>
> Anyway, let's see what others think.
>From a quick look at the code, it looks like the shmem alloc/fallback/charge
events are all lumped in with FILE_THP. But the instantaneous "how many are
allocated" and "how many are mapped" have their own NR_SHMEM_THPS and
NR_SHMEM_PMDMAPPED counters? So its a bit inconsistent today.
My preference would be to add these to be consistent with the anon stats:
MTHP_STAT_SHMEM_FAULT_ALLOC,
MTHP_STAT_SHMEM_FAULT_FALLBACK,
MTHP_STAT_SHMEM_FAULT_FALLBACK_CHARGE,
But it looks like these aren't always allocated due to faults? So perhaps:
MTHP_STAT_SHMEM_ALLOC,
MTHP_STAT_SHMEM_FALLBACK,
MTHP_STAT_SHMEM_FALLBACK_CHARGE,
If I've understood the code correctly (I know nothing about shmem), the
allocation can be for both mmap(SHARED|ANON) and for tmpfs? So "SHMEM_ANON"
probably isn't quite right?
>
>>> __MTHP_STAT_COUNT
>>> };
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 9e52c0db7580..dc15240c1ab3 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -557,6 +557,8 @@ DEFINE_MTHP_STAT_ATTR(anon_alloc_fallback,
>>> MTHP_STAT_ANON_ALLOC_FALLBACK);
>>> DEFINE_MTHP_STAT_ATTR(anon_swpout, MTHP_STAT_ANON_SWPOUT);
>>> DEFINE_MTHP_STAT_ATTR(anon_swpout_fallback, MTHP_STAT_ANON_SWPOUT_FALLBACK);
>>> DEFINE_MTHP_STAT_ATTR(anon_swpin_refault, MTHP_STAT_ANON_SWPIN_REFAULT);
>>> +DEFINE_MTHP_STAT_ATTR(shmem_anon_alloc, MTHP_STAT_SHMEM_ANON_ALLOC);
>>> +DEFINE_MTHP_STAT_ATTR(shmem_anon_alloc_fallback,
>>> MTHP_STAT_SHMEM_ANON_ALLOC_FALLBACK);
>>>
>>> static struct attribute *stats_attrs[] = {
>>> &anon_alloc_attr.attr,
>>> @@ -564,6 +566,8 @@ static struct attribute *stats_attrs[] = {
>>> &anon_swpout_attr.attr,
>>> &anon_swpout_fallback_attr.attr,
>>> &anon_swpin_refault_attr.attr,
>>> + &shmem_anon_alloc_attr.attr,
>>> + &shmem_anon_alloc_fallback_attr.attr,
>>> NULL,
>>> };
>>>
>>> diff --git a/mm/shmem.c b/mm/shmem.c
>>> index 8b009e7040b2..4a0aa75ab29c 100644
>>> --- a/mm/shmem.c
>>> +++ b/mm/shmem.c
>>> @@ -1706,11 +1706,14 @@ static struct folio *shmem_alloc_and_add_folio(struct
>>> vm_fault *vmf,
>>> pages = 1 << order;
>>> index = round_down(index, pages);
>>> folio = shmem_alloc_hugefolio(gfp, info, index, order);
>>> - if (folio)
>>> + if (folio) {
>>> + count_mthp_stat(order,
>>> MTHP_STAT_SHMEM_ANON_ALLOC);
is there any reason why this can't go next to the existing PMD-size stat?
>>> goto allocated;
>>> + }
>>>
>>> if (pages == HPAGE_PMD_NR)
>>> count_vm_event(THP_FILE_FALLBACK);
>>> + count_mthp_stat(order,
>>> MTHP_STAT_SHMEM_ANON_ALLOC_FALLBACK);
>>> order = next_order(&orders, order);
>>> }
>>> } else {
>>> --
>>> 2.39.3
>>>
>>
>> Thanks
>> Barry