Re: [EXT] Re: [RFC PATCH v2 0/2] Node migration between memory tiers

From: Huang, Ying
Date: Mon Dec 18 2023 - 22:59:27 EST


Hi, Srinivasulu,

Please use an email client that works for kernel patch review. Your
email is hard to read. It's hard to identify which part is your text
and which part is my text. Please refer to,

https://www.kernel.org/doc/html/latest/process/email-clients.html

Or something similar, for example,

https://elinux.org/Mail_client_tips

Srinivasulu Thanneeru <sthanneeru@xxxxxxxxxx> writes:

> Micron Confidential
> ________________________________________
> From: Huang, Ying <ying.huang@xxxxxxxxx>
> Sent: Friday, December 15, 2023 10:32 AM
> To: Srinivasulu Opensrc
> Cc: linux-cxl@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx; Srinivasulu
> Thanneeru; aneesh.kumar@xxxxxxxxxxxxx; dan.j.williams@xxxxxxxxx;
> gregory.price; mhocko@xxxxxxxx; tj@xxxxxxxxxx; john@xxxxxxxxxxxxxx;
> Eishan Mirakhur; Vinicius Tavares Petrucci; Ravis OpenSrc;
> Jonathan.Cameron@xxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx
> Subject: [EXT] Re: [RFC PATCH v2 0/2] Node migration between memory tiers
>
> <sthanneeru.opensrc@xxxxxxxxxx> writes:
>
>> From: Srinivasulu Thanneeru <sthanneeru.opensrc@xxxxxxxxxx>
>>
>> The memory tiers feature allows nodes with similar memory types
>> or performance characteristics to be grouped together in a
>> memory tier. However, there is currently no provision for
>> moving a node from one tier to another on demand.
>>
>> This patch series aims to support node migration between tiers
>> on demand by sysadmin/root user using the provided sysfs for
>> node migration.
>>
>> To migrate a node to a tier, the corresponding node’s sysfs
>> memtier_override is written with target tier id.
>>
>> Example: Move node2 to memory tier2 from its default tier (i.e., 4)
>>
>> 1. To check current memtier of node2
>> $cat /sys/devices/system/node/node2/memtier_override
>> memory_tier4
>>
>> 2. To migrate node2 to memory_tier2
>> $echo 2 > /sys/devices/system/node/node2/memtier_override
>> $cat /sys/devices/system/node/node2/memtier_override
>> memory_tier2
>>
>> Usecases:
>>
>> 1. Useful to move cxl nodes to the right tiers from userspace, when
>> the hardware fails to assign the tiers correctly based on
>> memory types.
>>
>> On some platforms we have observed cxl memory being assigned to
>> the same tier as DDR memory. This is arguably a system firmware
>> bug, but it is true that tiers represent *ranges* of performance
>> and we believe it's important for the system operator to have
>> the ability to override bad firmware or OS decisions about tier
>> assignment as a fail-safe against potential bad outcomes.
>>
>> 2. Useful if we want interleave weights to be applied on memory tiers
>> instead of nodes.
>> In a previous thread, Huang Ying <ying.huang@xxxxxxxxx> thought
>> this feature might be useful to overcome limitations of systems
>> where nodes with different bandwidth characteristics are grouped
>> in a single tier.
>> https://lore.kernel.org/lkml/87a5rw1wu8.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>>
>> =============
>> Version Notes:
>>
>> V2 : Changed interface to memtier_override from adistance_offset.
>> memtier_override was recommended by
>> 1. John Groves <john@xxxxxxxxxxxxxx>
>> 2. Ravi Shankar <ravis.opensrc@xxxxxxxxxx>
>> 3. Brice Goglin <Brice.Goglin@xxxxxxxx>
>
> It appears that you ignored my comments for V1 as follows ...
>
> https://lore.kernel.org/lkml/87o7f62vur.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> Thank you Huang, Ying for pointing to this.
>
> https://lpc.events/event/16/contributions/1209/attachments/1042/1995/Live%20In%20a%20World%20With%20Multiple%20Memory%20Types.pdf
>
> In the presentation above, the adistance_offsets are per memtype.
> We believe that adistance_offset per node is more suitable and flexible
> since we can change it per node. If we keep adistance_offset per memtype,
> then we cannot change it for a specific node of a given memtype.

Why do you need to change it for a specific node? Why don't you need to
change it for all nodes of a given memtype?

> https://lore.kernel.org/lkml/87jzpt2ft5.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> I guess that you need to move all NUMA nodes with same performance
> metrics together? If so, that is why we previously proposed to place
> the knob in "memory_type"? (From: Huang, Ying )
>
> Yes, memory_type would group the related memories together as a single tier.
> We should also have the flexibility to move nodes between tiers, to address the issues described in the use cases above.
>
> https://lore.kernel.org/lkml/87a5qp2et0.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> This patch provides a way to move a node to the correct tier.
> In test setups, we observed DRAM and CXL being put under the same
> tier (memory_tier4).
> By using this patch, we can move the CXL node away from the DRAM-linked
> tier4 and put it in the desired tier.

Good! Can you give more details, so that I can resend the patch with
your supporting data?

--
Best Regards,
Huang, Ying

> Regards,
> Srini
>
> --
> Best Regards,
> Huang, Ying
>
>> V1 : Introduced adistance_offset sysfs.
>>
>> =============
>>
>> Srinivasulu Thanneeru (2):
>> base/node: Add sysfs for memtier_override
>> memory tier: Support node migration between tiers
>>
>> Documentation/ABI/stable/sysfs-devices-node | 7 ++
>> drivers/base/node.c | 47 ++++++++++++
>> include/linux/memory-tiers.h | 11 +++
>> include/linux/node.h | 11 +++
>> mm/memory-tiers.c | 85 ++++++++++++---------
>> 5 files changed, 125 insertions(+), 36 deletions(-)
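
For reference, the two sysfs steps in the quoted cover letter (read the
current tier of a node, then write the target tier id) can be sketched
as below. This is only an illustration under stated assumptions:
memtier_override is the interface proposed by this RFC and is not a
merged ABI, and the helper names (override_path, current_tier,
move_node) and the sysfs_root parameter are hypothetical, introduced so
that the sketch can be tried against a scratch directory.

```python
from pathlib import Path


def override_path(node: int, sysfs_root: str = "/sys") -> Path:
    """Path of the per-node attribute proposed in the RFC (assumed layout)."""
    return (Path(sysfs_root) / "devices/system/node"
            / f"node{node}" / "memtier_override")


def current_tier(node: int, sysfs_root: str = "/sys") -> str:
    """Read the attribute; the cover letter shows values like 'memory_tier4'."""
    return override_path(node, sysfs_root).read_text().strip()


def move_node(node: int, tier: int, sysfs_root: str = "/sys") -> None:
    """Write the target tier id, mirroring 'echo 2 > .../memtier_override'."""
    override_path(node, sysfs_root).write_text(f"{tier}\n")
```

On a kernel carrying the proposed patches, move_node(2, 2) followed by
current_tier(2) would correspond to steps 1 and 2 of the example in the
cover letter; writing the attribute would need root privileges.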