Re: [RFC PATCH v4 4/7] mm/demotion/dax/kmem: Set node's memory tier to MEMORY_TIER_PMEM

From: Aneesh Kumar K V
Date: Wed Jun 01 2022 - 09:50:39 EST


On 6/1/22 11:59 AM, Bharata B Rao wrote:
On 5/27/2022 5:55 PM, Aneesh Kumar K.V wrote:
From: Jagdish Gediya <jvgediya@xxxxxxxxxxxxx>

By default, all nodes are assigned to DEFAULT_MEMORY_TIER which
is memory tier 1 which is designated for nodes with DRAM, so it
is not the right tier for dax devices.

Set dax kmem device node's tier to MEMORY_TIER_PMEM, In future,
support should be added to distinguish the dax-devices which should
not be MEMORY_TIER_PMEM and right memory tier should be set for them.

Signed-off-by: Jagdish Gediya <jvgediya@xxxxxxxxxxxxx>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
---
drivers/dax/kmem.c | 4 ++++
mm/migrate.c | 2 ++
2 files changed, 6 insertions(+)

diff --git a/drivers/dax/kmem.c b/drivers/dax/kmem.c
index a37622060fff..991782aa2448 100644
--- a/drivers/dax/kmem.c
+++ b/drivers/dax/kmem.c
@@ -11,6 +11,7 @@
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/mman.h>
+#include <linux/migrate.h>
#include "dax-private.h"
#include "bus.h"
@@ -147,6 +148,9 @@ static int dev_dax_kmem_probe(struct dev_dax *dev_dax)
dev_set_drvdata(dev, data);
+#ifdef CONFIG_TIERED_MEMORY
+ node_set_memory_tier(numa_node, MEMORY_TIER_PMEM);
+#endif

I was experimenting with this patchset and found this behaviour.
Here's what I did:

Boot a KVM guest with vNVDIMM device which ends up with device_dax
driver by default.

Use it as RAM by binding it to dax kmem driver. It now appears as
RAM with a new NUMA node that is put to memtier1 (the existing tier
where DRAM already exists)


That should have placed it in memtier2.

I can move it to memtier2 (MEMORY_RANK_PMEM) manually, but isn't
that expected to happen automatically when a node with dax kmem
device comes up?


This can happen if we have added the same NUMA node to memtier1 before dax kmem driver initialized the pmem memory. Can you check before the above node_set_memory_tier_rank() whether the specific NUMA node is already part of any memory tier?

Thank you for testing the patchset.
-aneesh