Re: [PATCH v5 4/9] mm/demotion: Build demotion targets based on explicit memory tiers

From: Aneesh Kumar K V
Date: Wed Jun 08 2022 - 05:04:18 EST


On 6/8/22 12:20 PM, Ying Huang wrote:
On Fri, 2022-06-03 at 19:12 +0530, Aneesh Kumar K.V wrote:
This patch switches the demotion target building logic to use memory tiers
instead of NUMA distance. All N_MEMORY NUMA nodes will be placed in the
default tier 1 and additional memory tiers will be added by drivers like
dax kmem.

This patch builds the demotion target for a NUMA node by looking at all
memory tiers below the tier to which the NUMA node belongs. The closest node
in the immediately following memory tier is used as a demotion target.
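
The selection rule described above can be sketched in userspace C roughly as
follows. This is only an illustration, not the code from the patch: the
function name, the fixed NR_NODES, the per-node rank array and the distance
matrix are all assumptions made for the example.

```c
#include <assert.h>
#include <limits.h>

#define NR_NODES  4      /* hypothetical node count for the example */
#define NO_TARGET (-1)

/*
 * Pick a demotion target for `node`.
 * node_rank[i]   : rank of the tier node i belongs to (higher rank = higher tier)
 * distance[i][j] : NUMA distance from node i to node j
 * Returns the closest node in the next lower populated tier, or NO_TARGET
 * if the node is already in the lowest tier.
 */
static int pick_demotion_target(int node, const int node_rank[NR_NODES],
                                int distance[NR_NODES][NR_NODES])
{
    int next_rank = INT_MIN;
    int best = NO_TARGET;
    int best_dist = INT_MAX;

    assert(node >= 0 && node < NR_NODES);

    /* Find the highest rank that is strictly below this node's rank. */
    for (int i = 0; i < NR_NODES; i++)
        if (node_rank[i] < node_rank[node] && node_rank[i] > next_rank)
            next_rank = node_rank[i];

    if (next_rank == INT_MIN)
        return NO_TARGET;

    /* Among the nodes in that tier, choose the one with the smallest distance. */
    for (int i = 0; i < NR_NODES; i++) {
        if (node_rank[i] == next_rank && distance[node][i] < best_dist) {
            best_dist = distance[node][i];
            best = i;
        }
    }
    return best;
}
```

With two DRAM nodes at rank 200 and two slower nodes at rank 100, each DRAM
node would get the nearer of the two rank-100 nodes as its target, and the
rank-100 nodes would get none.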

Since we are now only building demotion targets for N_MEMORY NUMA nodes,
the CPU hotplug calls are removed in this patch.

The rank approach allows us to keep memory tier device IDs stable even if there
is a need to change the tier ordering among different memory tiers. e.g. DRAM
nodes with CPUs will always be on memtier1, no matter how many tiers are higher
or lower than these nodes. A new memory tier can be inserted into the tier
hierarchy for a new set of nodes without affecting the node assignment of any
existing memtier, provided that there is enough gap in the rank values for the
new memtier.

The absolute value of "rank" of a memtier doesn't necessarily carry any meaning.
Its value relative to other memtiers decides the level of this memtier in the tier
hierarchy.

For now, this patch supports hardcoded rank values of 300, 200, and 100 for
memory tiers 0, 1, and 2 respectively.
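
To illustrate how rank, rather than device ID, decides the ordering, here is
a small userspace sketch. The struct and function names are made up for the
example, and the fourth tier (ID 3, rank 150) in the usage note is purely
hypothetical; the patch itself only defines the three tiers listed above.

```c
#include <assert.h>
#include <stdlib.h>

struct memtier {
    int id;    /* stable device ID, as in memtierN */
    int rank;  /* position in the hierarchy; higher rank = higher tier */
};

/* qsort comparator: larger rank sorts first. */
static int cmp_rank_desc(const void *a, const void *b)
{
    const struct memtier *x = a;
    const struct memtier *y = b;

    return (x->rank < y->rank) - (x->rank > y->rank);
}

/* Order tiers from highest to lowest rank; the IDs are left untouched. */
static void order_tiers(struct memtier *tiers, size_t n)
{
    assert(tiers != NULL);
    qsort(tiers, n, sizeof(*tiers), cmp_rank_desc);
}
```

Given tiers {0, 300}, {1, 200}, {2, 100} plus a hypothetical {3, 150},
ordering by rank places memtier3 between memtier1 and memtier2 while every
existing tier keeps its ID.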

Below is the sysfs interface to read the rank value of a memory tier:
/sys/devices/system/memtier/memtierN/rank

This interface is read-only for now. Write support can be added when there is
a need for more memory tiers (> 3) with flexible ordering requirements among
them.
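
A userspace reader for this file might look like the sketch below. The sysfs
path is the one proposed in this patch description and may not exist on a
given kernel, so the base directory is passed in by the caller; the function
name and return convention are assumptions for the example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Read <base>/memtier<tier>/rank into *rank.
 * Returns 0 on success, -1 if the file is missing or unparsable.
 * `base` would be "/sys/devices/system/memtier" for the interface
 * described above.
 */
static int read_memtier_rank(const char *base, int tier, int *rank)
{
    char path[256];
    FILE *f;
    int len, ok;

    assert(base != NULL && rank != NULL);

    len = snprintf(path, sizeof(path), "%s/memtier%d/rank", base, tier);
    if (len < 0 || (size_t)len >= sizeof(path))
        return -1;  /* path did not fit in the buffer */

    f = fopen(path, "r");
    if (!f)
        return -1;  /* tier not present, or interface not available */

    ok = (fscanf(f, "%d", rank) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

On failure *rank is left unchanged, so a caller can tell a missing tier from
a tier whose rank happens to be 0.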

Suggested-by: Wei Xu <weixugc@xxxxxxxxxx>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxx>
---
 include/linux/memory-tiers.h | 5 +
 include/linux/migrate.h | 13 --
 mm/memory-tiers.c | 269 ++++++++++++++++++++++++
 mm/migrate.c | 394 -----------------------------------
 mm/vmstat.c | 4 -
 5 files changed, 274 insertions(+), 411 deletions(-)

It appears that you moved some code from migrate.c to memory-tiers.c and
changed it. If so, please separate the changes. That is, one patch
only moves the code, the other changes the code. This will make it easier
to find out what has changed.

That was how it was done in the earlier version. That is, we changed
establish_migration within the same file. The changes we are making here
were so different that it was mentioned that it gets very hard to review
in a context diff. Hence this patch, where we killed the old code and
wrote the new code in memory-tiers.c. I could still move the code to
memory-tiers.c and do the changes on top of that. In fact, I do have a
patch that does similar code movement in the series. But the diff was not
useful for an easy review.

-aneesh