Re: RFC: Memory Tiering Kernel Interfaces (v2)

From: ying.huang@xxxxxxxxx
Date: Thu May 12 2022 - 03:19:15 EST


On Thu, 2022-05-12 at 12:42 +0530, Aneesh Kumar K V wrote:
> On 5/12/22 12:33 PM, ying.huang@xxxxxxxxx wrote:
> > On Wed, 2022-05-11 at 23:22 -0700, Wei Xu wrote:
> > > Sysfs Interfaces
> > > ================
> > >
> > > * /sys/devices/system/memtier/memtierN/nodelist
> > >
> > >    where N = 0, 1, 2 (the kernel supports only 3 tiers for now).
> > >
> > >    Format: node_list
> > >
> > >    Read-only. When read, list the memory nodes in the specified tier.
> > >
> > >    Tier 0 is the highest tier, while tier 2 is the lowest tier.
> > >
> > >    The absolute value of a tier id number has no specific meaning.
> > >    What matters is the relative order of the tier id numbers.
> > >
> > >    When a memory tier has no nodes, the kernel can hide its memtier
> > >    sysfs files.
> > >
> > > * /sys/devices/system/node/nodeN/memtier
> > >
> > >    where N = 0, 1, ...
> > >
> > >    Format: int or empty
> > >
> > >    When read, list the memory tier that the node belongs to. Its value
> > >    is empty for a CPU-only NUMA node.
> > >
> > >    When written, the kernel moves the node into the specified memory
> > >    tier if the move is allowed. The tier assignment of all other nodes
> > >    is not affected.
> > >
> > >    Initially, we can make this interface read-only.
> >
> > It seems that "/sys/devices/system/node/nodeN/memtier" has all the
> > information we need. Do we really need
> > "/sys/devices/system/memtier/memtierN/nodelist"?
> >
> > The same information can be obtained with a simple shell command:
> >
> > $ grep . /sys/devices/system/node/node*/memtier | sort -n -k 2 -t ':'
> >
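
For illustration only, here is a minimal sketch of how user space could
rebuild the per-tier node lists from the per-node files alone. It assumes the
nodeN/memtier files proposed in this RFC (not an existing kernel ABI), each
holding a single tier id, or nothing for a CPU-only node. The function name
list_tiers and its optional root-directory argument are made up for this
example.

```shell
#!/bin/sh
# Sketch: print one "tier: node,node,..." line per memory tier, using only
# the per-node memtier files proposed in this RFC (hypothetical layout).

# list_tiers [ROOT]
#   ROOT defaults to /sys/devices/system/node; pass another directory to
#   try the function against a mock tree.
list_tiers() {
    root=${1:-/sys/devices/system/node}
    for f in "$root"/node[0-9]*/memtier; do
        [ -r "$f" ] || continue      # glob did not match, or file unreadable
        tier=$(cat "$f")
        [ -n "$tier" ] || continue   # empty file: CPU-only node, skip it
        node=${f%/memtier}           # strip the trailing "/memtier"
        node=${node##*/node}         # keep only the node number
        printf '%s %s\n' "$tier" "$node"
    done | sort -n -k1,1 -k2,2 | awk '
        # Start a new output line whenever the tier id changes.
        NR == 1 || $1 != tier {
            if (NR > 1) printf "\n"
            printf "%s: %s", $1, $2
            tier = $1
            next
        }
        # Same tier as the previous record: append the node number.
        { printf ",%s", $2 }
        END { if (NR > 0) printf "\n" }'
}
```

Calling list_tiers with no argument reads /sys/devices/system/node. Each
output line has the form "tier: node,node,...", with tiers and nodes in
ascending numeric order.
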
>
> It would be really useful to be able to fetch the node list of a memory
> tier easily, rather than by reading multiple sysfs directories. If we don't
> have other attributes for a memory tier, we could make
> "/sys/devices/system/memtier/memtierN" itself the NUMA node list, thereby
> avoiding /sys/devices/system/memtier/memtierN/nodelist.

This would make the interface non-extensible. Even a single file
"/sys/devices/system/node/memtiers" would be better. As a read-only file, it
should be OK to put multiple values in it.

I recall that one convention for sysfs is that it is mostly accessed via
libsysfs. Does that make life easier?

Best Regards,
Huang, Ying