Re: [RFC] Memory Tiering

From: Huang, Ying
Date: Thu Oct 24 2019 - 23:30:50 EST


Dave Hansen <dave.hansen@xxxxxxxxx> writes:

> On 10/23/19 4:11 PM, Jonathan Adams wrote:
>> we would have a bidirectional attachment:
>>
>> A is marked "move cold pages to" B
>> B is marked "move hot pages to" A
>> C is marked "move cold pages to" D
>> D is marked "move hot pages to" C
>>
>> By using autonuma for moving PMEM pages back to DRAM, you avoid
>> needing the B->A & D->C links, at the cost of migrating the pages
>> back synchronously at pagefault time (assuming my understanding of how
>> autonuma works is accurate).
>>
>> Our approach still lets you have multiple levels of hierarchy for a
>> given socket (you could imagine an "E" node with the same relation to
>> "B" as "B" has to "A"), but doesn't make it easy to represent (say) an
>> "E" which was equally close to all sockets (which I could imagine for
>> something like remote memory on GenZ or what-have-you), since there
>> wouldn't be a single back link; there would need to be something like
>> your autonuma support to achieve that.
>>
>> Does that make sense?
>
> Yes, it does. We've actually tried a few other approaches separate from
> autonuma-based ones for promotion. For some of those, we have a
> promotion path which is separate from the demotion path.
>
> That said, I took a quick look to see what the autonuma behavior was and
> couldn't find anything obvious. Ying, when moving a slow page due to
> autonuma, do we move it close to the CPU that did the access, or do we
> promote it to the DRAM close to the slow memory where it is now?

Currently, autonuma moves the slow page to the node of the CPU that did
the access. So I think Jonathan's requirement is already covered.
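
To make the difference between the two promotion schemes concrete, here
is a minimal sketch. It is only a toy model, not kernel code: the node
names A/B/C/D come from Jonathan's example, the assumption that A and C
are DRAM nodes with B and D as their attached PMEM nodes is mine, and the
table and function names are hypothetical.

```python
# Toy model of the topology discussed in the thread (assumption: A and C
# are DRAM nodes, B and D are the PMEM nodes attached to them).
DEMOTE_TO = {"A": "B", "C": "D"}    # "move cold pages to"
PROMOTE_TO = {"B": "A", "D": "C"}   # "move hot pages to" (explicit back links)


def promote_by_back_link(page_node):
    """Promotion target under the bidirectional-link scheme."""
    return PROMOTE_TO[page_node]


def promote_by_accessing_cpu(page_node, cpu_node):
    """Promotion target when the page follows the CPU that touched it.

    page_node is deliberately unused: no back link is consulted, so a
    node without a single "home" DRAM node needs no special handling.
    """
    return cpu_node
```

For a hot page on B touched by a CPU on node C, the back-link scheme
picks A, while the access-based scheme picks C. A hypothetical node "E"
that is equally close to all sockets has no entry in PROMOTE_TO, but the
access-based function still returns a target for it.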

Best Regards,
Huang, Ying