Re: [PATCH v1] mm: shmem: Rename mTHP shmem counters

From: Ryan Roberts
Date: Tue Jul 09 2024 - 03:47:31 EST


On 08/07/2024 21:50, David Hildenbrand wrote:
> On 08.07.24 14:29, Ryan Roberts wrote:
>> On 08/07/2024 12:36, Barry Song wrote:
>>> On Mon, Jul 8, 2024 at 11:24 PM Ryan Roberts <ryan.roberts@xxxxxxx> wrote:
>>>>
>>>> The legacy PMD-sized THP counters at /proc/vmstat include
>>>> thp_file_alloc, thp_file_fallback and thp_file_fallback_charge, which
>>>> rather confusingly refer to shmem THP and do not include any other types
>>>> of file pages. This is inconsistent since in most other places in the
>>>> kernel, THP counters are explicitly separated for anon, shmem and file
>>>> flavours. However, we are stuck with it since it constitutes a user ABI.
>>>>
>>>> Recently, commit 66f44583f9b6 ("mm: shmem: add mTHP counters for
>>>> anonymous shmem") added equivalent mTHP stats for shmem, keeping the
>>>> same "file_" prefix in the names. But in future, we may want to add
>>>> extra stats to cover actual file pages, at which point, it would all
>>>> become very confusing.
>>>>
>>>> So let's take the opportunity to rename these new counters "shmem_"
>>>> before the change makes it upstream and the ABI becomes immutable.
>>>
>>> Personally, I think this approach is much clearer. However, I recall
>>> we discussed this
>>> before [1], and it seems that inconsistency is a concern?
>>
>> Embarrassingly, I don't recall that converstation at all :-| but at least what I
>> said then is consistent with what I've done in this patch.
>>
>> I think David's conclusion from that thread was to call them FILE_, and add both
>> shmem and pagecache counts to those counters, meaning we can keep the same name
>> as legacy THP counters. But those legacy THP counters only count shmem, and I
>> don't think we would get away with adding pagecache counts to those at this
>> point? (argument: they have been around for long time and there is a risk that
>> user space relies on them and if they were to dramatically increase due to
>> pagecache addition now that could break things). In that case, there is still
>> inconsistency, but its worse; the names are consistent but the semantics are
>> inconsistent.
>>
>> So my vote is to change to SHMEM_ as per this patch :)
>
> I also forgot most of the discussion, but these 3 legacy counters are really
> only (currently) incremented for shmem. I think my idea was to keep everything
> as FILE_ for now, maybe at some point make the pagecache also use them, and then
> maybe have separate FILE_ + SHMEM_.
>
> But yeah, likely it's best to only have "shmem" here for now, because who knows
> what we can actually change about the legacy counters. But it's always though
> messing with legacy stuff that is clearly suboptimal ...

Sorry David, I've read your response a bunch of times and am still not
completely sure what you are advocating for.

To make my proposal crystal clear, I think we should leave the legacy counters
alone - neither change their name nor what they count (which is _only_ shmem). I
think we should rename the new mTHP counters to "shmem" and have them continue
to only count shmem. Then additionally, I think we should introduce new "file"
mTHP counters that count the page cache allocations (that's a future set of
patches, which I'm working on together with user controls to determine which
mTHP sizes can be used by page cache).

As suggested by Barry, I propose to also improve the documentation for the
legacy counters to make it clear that dispite their name being "file" they are
actually counting "shmem". I'll do this for v2.

David, would you support this approach? If so, I'd like to push this forwards
asap so that it gets into v6.11 to avoid ever exposing the mthp counters with
the "file" name.

Thanks,
Ryan