Re: [PATCH v14 04/10] mm/demotion/dax/kmem: Set node's abstract distance to MEMTIER_DEFAULT_DAX_ADISTANCE
From: Bharata B Rao
Date: Tue Aug 16 2022 - 05:45:53 EST

On 8/16/2022 12:58 PM, huang ying wrote:
> On Tue, Aug 16, 2022 at 1:10 PM Aneesh Kumar K V
> <aneesh.kumar@xxxxxxxxxxxxx> wrote:
>>
>> On 8/15/22 8:09 AM, Huang, Ying wrote:
>>> "Aneesh Kumar K.V" <aneesh.kumar@xxxxxxxxxxxxx> writes:
>>>
>
> [snip]
>
>>>>
>>>> +/*
>>>> + * Default abstract distance assigned to the NUMA node onlined
>>>> + * by DAX/kmem if the low level platform driver didn't initialize
>>>> + * one for this NUMA node.
>>>> + */
>>>> +#define MEMTIER_DEFAULT_DAX_ADISTANCE (MEMTIER_ADISTANCE_DRAM * 2)
>>>
>>> If my understanding is correct, this is targeting Optane DCPMM for
>>> now. The measured results in the following paper are:
>>>
>>> https://arxiv.org/pdf/2002.06018.pdf
>>>
>>> Section: 2.1 Read/Write Latencies
>>>
>>> "
>>> For read access, the latency of DCPMM was 400.1% higher than that of
>>> DRAM. For write access, it was 407.1% higher.
>>> "
>>>
>>> Section: 2.2 Read/Write Bandwidths
>>>
>>> "
>>> For read access, the throughput of DCPMM was 37.1% of DRAM. For write
>>> access, it was 7.8%
>>> "
>>>
>>> According to the above data, I think the MEMTIER_DEFAULT_DAX_ADISTANCE
>>> can be "5 * MEMTIER_ADISTANCE_DRAM".
>>>
>>
>> If we map every 100% increase in latency to one memory tier, we essentially
>> have 4 memory tiers here. Each memory tier covers an abstract distance range of 128,
>> which makes the total adistance increase over MEMTIER_ADISTANCE_DRAM 512. This puts
>> MEMTIER_DEFAULT_DAX_ADISTANCE at 1024, or MEMTIER_ADISTANCE_DRAM * 2.
>
> If my understanding is correct, you are suggesting a kind of
> logarithmic mapping from latency to abstract distance? That is,
>
> abstract_distance = log2(latency)
>
> While I am suggesting to use a kind of linear mapping from latency to
> abstract distance. That is,
>
> abstract_distance = C * latency
>
> I think that linear mapping is easy to understand.
>
> Are there some good reasons to use logarithmic mapping?

Also, what is the recommendation for using the bandwidth measurements that
may be available from HMAT for CXL memory? How is bandwidth going
to influence the abstract distance?
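
To make the two proposals quoted above concrete, here is a rough
user-space sketch (not kernel code). The constants are assumptions
inferred from the numbers in this thread: 512 for MEMTIER_ADISTANCE_DRAM
(since 1024 is described as MEMTIER_ADISTANCE_DRAM * 2) and 128 for the
adistance range of one tier. The function names are made up for
illustration, and the paper's 400.1% figure is rounded to 400% higher,
i.e. roughly 5x DRAM latency. Bandwidth is not modeled at all, which is
exactly the open question.

```c
/* User-space illustration only; none of these names exist in the kernel. */

/* Assumed values, inferred from the figures quoted in this thread. */
#define ADISTANCE_DRAM   512   /* stands in for MEMTIER_ADISTANCE_DRAM */
#define ADISTANCE_CHUNK  128   /* adistance range covered by one memory tier */

/*
 * "One tier per 100% latency increase" mapping:
 * every full 100% of extra latency over DRAM adds one chunk.
 */
static inline int adistance_per_tier(int latency_increase_pct)
{
	return ADISTANCE_DRAM + (latency_increase_pct / 100) * ADISTANCE_CHUNK;
}

/*
 * Linear mapping: adistance scales with latency relative to DRAM,
 * where relative_latency_pct == 100 means "same latency as DRAM".
 */
static inline int adistance_linear(int relative_latency_pct)
{
	return ADISTANCE_DRAM * relative_latency_pct / 100;
}
```

With these assumptions, adistance_per_tier(400) gives 1024
(MEMTIER_ADISTANCE_DRAM * 2), while adistance_linear(500) gives 2560
(5 * MEMTIER_ADISTANCE_DRAM), which matches the two values suggested
earlier in the thread.
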
Regards,
Bharata.