Re: [RFC PATCH v4 5/7] mm/demotion: Add support to associate rank with memory tier

From: Ying Huang
Date: Thu Jun 02 2022 - 02:42:28 EST


On Fri, 2022-05-27 at 17:55 +0530, Aneesh Kumar K.V wrote:
> The rank approach allows us to keep memory tier device IDs stable even if there
> is a need to change the tier ordering among different memory tiers. For example, DRAM
> nodes with CPUs will always be on memtier1, no matter how many tiers are higher
> or lower than these nodes. A new memory tier can be inserted into the tier
> hierarchy for a new set of nodes without affecting the node assignment of any
> existing memtier, provided that there is enough gap in the rank values for the
> new memtier.
>
> The absolute value of "rank" of a memtier doesn't necessarily carry any meaning.
> Its value relative to other memtiers decides the level of this memtier in the tier
> hierarchy.
>
> For now, this patch supports hardcoded rank values of 100, 200, and 300 for
> memory tiers 0, 1, and 2, respectively.
>
> The rank value of a memory tier can be read through this sysfs interface:
> /sys/devices/system/memtier/memtierN/rank
>
> This interface is read-only for now. Write support can be added when more than
> three memory tiers with flexible ordering among them are needed. Rank is useful
> there because rank, not the memory tier device ID, now decides the memory tier
> ordering.
>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
> ---
>  drivers/base/node.c | 5 +-
>  drivers/dax/kmem.c | 2 +-
>  include/linux/migrate.h | 17 ++--
>  mm/migrate.c | 218 ++++++++++++++++++++++++----------------
>  4 files changed, 144 insertions(+), 98 deletions(-)
>
> diff --git a/drivers/base/node.c b/drivers/base/node.c
> index cf4a58446d8c..892f7c23c94e 100644
> --- a/drivers/base/node.c
> +++ b/drivers/base/node.c
> @@ -567,8 +567,11 @@ static ssize_t memtier_show(struct device *dev,
>  			    char *buf)
>  {
>  	int node = dev->id;
> +	int tier_index = node_get_memory_tier_id(node);
>  
> -	return sysfs_emit(buf, "%d\n", node_get_memory_tier(node));
> +	if (tier_index != -1)
> +		return sysfs_emit(buf, "%d\n", tier_index);
> +	return 0;
>  }
>  
>  static ssize_t memtier_store(struct device *dev,
> diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
> index 991782aa2448..79953426ddaf 100644
> --- a/drivers/dax/kmem.c
> +++ b/drivers/dax/kmem.c
> @@ -149,7 +149,7 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
>  	dev_set_drvdata(dev, data);
>  
>  #ifdef CONFIG_TIERED_MEMORY
> -	node_set_memory_tier(numa_node, MEMORY_TIER_PMEM);
> +	node_set_memory_tier_rank(numa_node, MEMORY_RANK_PMEM);

I think that we can work with the memory tier ID inside the kernel?
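
To make the question concrete, here is a rough userspace sketch, not kernel
code, of what working with tier IDs could look like: callers pass a tier ID,
and the rank is looked up from a table only when ordering matters. All
function and enum names here (sketch_node_set_tier_id(), sketch_node_get_rank(),
sketch_tier_cmp(), MEMTIER_0, MEMTIER_2) are hypothetical and not part of the
patch. Only the rank values 100/200/300 for tiers 0/1/2 and "DRAM is memtier1"
come from the commit message.

```c
#define MAX_NODES 8

/* DRAM on memtier1 is from the commit message; the other two names are
 * placeholders invented for this sketch. */
enum memtier_id {
	MEMTIER_0 = 0,
	MEMTIER_DRAM = 1,
	MEMTIER_2 = 2,
	MEMTIER_COUNT
};

/* Hardcoded ranks from the commit message: 100, 200, 300 for tiers 0, 1, 2. */
static const int memtier_rank[MEMTIER_COUNT] = { 100, 200, 300 };

/* Per-node tier ID; -1 means no tier has been assigned. */
static int node_tier[MAX_NODES] = { -1, -1, -1, -1, -1, -1, -1, -1 };

/* Callers identify the tier by ID; returns 0 on success, -1 on bad input. */
int sketch_node_set_tier_id(int node, int tier)
{
	if (node < 0 || node >= MAX_NODES)
		return -1;
	if (tier < 0 || tier >= MEMTIER_COUNT)
		return -1;
	node_tier[node] = tier;
	return 0;
}

/* Rank is derived from the tier ID on demand; -1 if the node has no tier. */
int sketch_node_get_rank(int node)
{
	if (node < 0 || node >= MAX_NODES || node_tier[node] == -1)
		return -1;
	return memtier_rank[node_tier[node]];
}

/* Order two valid tier IDs by rank rather than by ID value. */
int sketch_tier_cmp(int a, int b)
{
	return memtier_rank[a] - memtier_rank[b];
}
```

With something along these lines, the call site in dev_dax_kmem_probe() would
keep passing a tier ID, and only the ordering logic would consult the rank.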

Best Regards,
Huang, Ying


[snip]