Re: [PATCH v2] mm/demotion: Expose memory tier details via sysfs

From: Aneesh Kumar K V
Date: Tue Aug 30 2022 - 03:28:50 EST


On 8/30/22 12:47 PM, Wei Xu wrote:
> On Mon, Aug 29, 2022 at 11:46 PM Aneesh Kumar K V
> <aneesh.kumar@xxxxxxxxxxxxx> wrote:
>>
>> On 8/30/22 12:01 PM, Wei Xu wrote:
>>> On Sun, Aug 28, 2022 at 11:08 PM Aneesh Kumar K.V
>>> <aneesh.kumar@xxxxxxxxxxxxx> wrote:
>>>>
>>>> This patch adds /sys/devices/virtual/memory_tiering/ where all memory tier
>>>> related details can be found. All allocated memory tiers will be listed
>>>> there as /sys/devices/virtual/memory_tiering/memory_tierN/
>>>>
>>>> The nodes which are part of a specific memory tier can be listed via
>>>> /sys/devices/virtual/memory_tiering/memory_tierN/nodes
>>>>
>>>> The abstract distance range value of a specific memory tier can be listed via
>>>> /sys/devices/virtual/memory_tiering/memory_tierN/abstract_distance
>>>>
>>>> A directory hierarchy looks like
>>>> :/sys/devices/virtual/memory_tiering$ tree memory_tier4/
>>>> memory_tier4/
>>>> ├── abstract_distance
>>>> ├── nodes
>>>> ├── subsystem -> ../../../../bus/memory_tiering
>>>> └── uevent
>>>>
>>>> All toptier nodes are listed via
>>>> /sys/devices/virtual/memory_tiering/toptier_nodes
>>>>
>>>> :/sys/devices/virtual/memory_tiering$ cat toptier_nodes
>>>> 0,2
>>>> :/sys/devices/virtual/memory_tiering$ cat memory_tier4/nodes
>>>> 0,2
>>>> :/sys/devices/virtual/memory_tiering$ cat memory_tier4/abstract_distance
>>>> 512 - 639
>>>>
>>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
>>>> ---
>>>> .../ABI/testing/sysfs-kernel-mm-memory-tiers | 41 +++++
>>>> mm/memory-tiers.c | 155 +++++++++++++++---
>>>> 2 files changed, 174 insertions(+), 22 deletions(-)
>>>> create mode 100644 Documentation/ABI/testing/sysfs-kernel-mm-memory-tiers
>>>>
>>>> diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-memory-tiers b/Documentation/ABI/testing/sysfs-kernel-mm-memory-tiers
>>>> new file mode 100644
>>>> index 000000000000..6955f69a4423
>>>> --- /dev/null
>>>> +++ b/Documentation/ABI/testing/sysfs-kernel-mm-memory-tiers
>>>> @@ -0,0 +1,41 @@
>>>> +What: /sys/devices/virtual/memory_tiering/
>>>> +Date: August 2022
>>>> +Contact: Linux memory management mailing list <linux-mm@xxxxxxxxx>
>>>> +Description: A collection of all the memory tiers allocated.
>>>> +
>>>> + Individual memory tier details are contained in subdirectories
>>>> + named by the abstract distance of the memory tier.
>>>> +
>>>> + /sys/devices/virtual/memory_tiering/memory_tierN/
>>>> +
>>>> +
>>>> +What: /sys/devices/virtual/memory_tiering/memory_tierN/
>>>> + /sys/devices/virtual/memory_tiering/memory_tierN/abstract_distance
>>>> + /sys/devices/virtual/memory_tiering/memory_tierN/nodes
>>>> +Date: August 2022
>>>> +Contact: Linux memory management mailing list <linux-mm@xxxxxxxxx>
>>>> +Description: Directory with details of a specific memory tier
>>>> +
>>>> + This is the directory containing information about a particular
>>>> + memory tier, memtierN, where N is derived based on abstract distance.
>>>> +
>>>> + A smaller value of N implies a higher (faster) memory tier in the
>>>> + hierarchy.
>>>
>>> Given that abstract_distance is provided, it would be more flexible if
>>> we don't commit to the interface where N in memtierN also indicates
>>> the memory tier ordering.
>>
>>
>> IIUC this is one of the requests that Johannes had, i.e., to be able to understand the
>> memory tier hierarchy based on the memtier name.
>>
>>>> +
>>>> + abstract_distance: The abstract distance range this specific memory
>>>> + tier maps to.
>>>
>>> I still think the name of "abstract distance" is kind of confusing
>>> because it is not clear what is the other object that this distance
>>> value is relative to. Do we have to expose this value at this point
>>> if N in memtierN can already indicate the memory tier ordering?
>>>
>>
>> I do agree that abstract distance is confusing. But IIUC we agreed that it is much better
>> than the other names suggested and is closer to the already understood "NUMA distance" term.
>>
>> https://lore.kernel.org/linux-mm/YuLF%2FGG8x5lQvg%2Ff@xxxxxxxxxxx/
>>
>
> "NUMA distance" measures the distance between two NUMA nodes.
>
> I bring it up again because this name will become a user visible
> kernel interface, which we will need to live with for a long time.
> Even if we decide to keep the name, it would be better if we can
> define the two (abstract) points between which the abstract distance
> is measured.
>
> Another option is to remove this interface for now until it becomes
> necessary to report abstract distances to userspace.
>

OK, I will send a v3 with abstract_distance dropped.
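
For anyone scripting against the files that remain, here is a minimal
userspace sketch in Python (not part of the patch). It assumes the
memory_tierN/nodes files stay at the path described above and use the
usual kernel node list format (comma-separated ids, possibly with ranges
such as 0-3). parse_nodelist() and read_memory_tiers() are hypothetical
helper names used only for illustration.

```python
import glob
import os

# Path taken from the patch description; everything else here is illustrative.
SYSFS_ROOT = "/sys/devices/virtual/memory_tiering"


def parse_nodelist(text):
    """Parse a node list such as "0,2" or "0-1,4" into a sorted list of ints.

    Assumption: the file uses the kernel's list format, where ranges are
    written as "first-last".
    """
    nodes = set()
    for part in text.strip().split(","):
        if not part:
            continue  # empty file or trailing comma
        if "-" in part:
            first, last = part.split("-", 1)
            nodes.update(range(int(first), int(last) + 1))
        else:
            nodes.add(int(part))
    return sorted(nodes)


def read_memory_tiers(root=SYSFS_ROOT):
    """Return {tier_name: [node ids]} for each memory_tierN directory under root."""
    tiers = {}
    for path in sorted(glob.glob(os.path.join(root, "memory_tier*"))):
        try:
            with open(os.path.join(path, "nodes")) as f:
                tiers[os.path.basename(path)] = parse_nodelist(f.read())
        except OSError:
            continue  # tier went away or file is unreadable; skip it
    return tiers


if __name__ == "__main__":
    for name, nodes in read_memory_tiers().items():
        print(name, nodes)
```

On a kernel without this patch the directory does not exist, so
read_memory_tiers() simply returns an empty dict.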

-aneesh