Re: [RFC PATCH 0/2] Hot page promotion optimization for large address space

From: Huang, Ying
Date: Fri Apr 12 2024 - 05:03:15 EST


Bharata B Rao <bharata@xxxxxxx> writes:

> On 12-Apr-24 12:58 PM, Huang, Ying wrote:
>> Bharata B Rao <bharata@xxxxxxx> writes:
>>
>>> On 03-Apr-24 2:10 PM, Huang, Ying wrote:
>>>>> Here are the numbers for the 192nd chunk:
>>>>>
>>>>> Each iteration of 262144 random accesses takes around ~10ms
>>>>> 512 such iterations are taking ~5s
>>>>> numa_scan_seq is 16 when this chunk is accessed.
>>>>> And no page promotions were done from this chunk. All the
>>>>> time should_numa_migrate_memory() found the NUMA hint fault
>>>>> latency to be higher than threshold.
>>>>>
>>>>> Are these time periods considered too short for the pages
>>>>> to be detected as hot and promoted?
>>>>
>>>> Yes. I think so. This is burst accessing, not repeated accessing.
>>>> IIUC, NUMA balancing based promotion only works for repeated accessing
>>>> for long time, for example, >100s.
>>>
>>> Hmm... A page is accessed 512 times over a period of 5s and it is still
>>> not detected as hot. This is understandable if fresh scanning couldn't
>>> be done as the accesses were bursty and hence they couldn't be captured via
>>> NUMA hint faults. But here the access captured via hint fault is being rejected
>>> as not hot because the scanning was done a while back. But I do see the challenge
>>> here since we depend on scanning time to obtain the frequency-of-access metric.
>>
>> Consider some pages that will be accessed once every 1 hour, should we
>> consider it hot or not? Will your proposed method deal with that
>> correctly?
>
> The proposed method removes the absolute time as a factor for the decision and instead
> relies on the number of hint faults that have occurred since that page was scanned last.
> As long as there are enough hint faults happening in that 1 hour (which means many
> other accesses have been captured in that 1 hour), that page shouldn't be considered
> hot. You did mention earlier about hint fault rate varying a lot and one thing I haven't
> tried yet is to vary the fault threshold based on current or historical fault rate.

In your original example, if many other accesses occur between the NUMA
balancing page table scan and the 512 page accesses, your method cannot
identify the page as hot either, right?

If the NUMA balancing page table scanning period is much longer than 5s,
it is highly likely that neither your method nor the original method can
distinguish between 1 and 512 page accesses within 5s.
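
To make this concrete, here is a minimal Python sketch of that case. It is
a toy model, not kernel code: the function names, the 60s gap between scan
and first access, the 1000 faults/s system-wide rate, and both thresholds
are hypothetical values chosen only for illustration. The one property it
relies on is that a hint fault is taken only on the first access after a
scan, because the fault handler makes the PTE accessible again.

```python
from dataclasses import dataclass


@dataclass
class HintFault:
    latency_s: float        # time from the page scan to the first access after it
    faults_since_scan: int  # system-wide hint faults seen in that interval


def first_hint_fault(scan_time_s, access_times_s, fault_rate_per_s):
    """Model what is observed for one scan of one page.

    Only the first access after the scan traps; later accesses in the
    same scan period are invisible to both methods.
    """
    later = [t for t in access_times_s if t >= scan_time_s]
    if not later:
        return None
    latency = min(later) - scan_time_s
    # Assumption: other hint faults arrive at a constant rate.
    return HintFault(latency, int(latency * fault_rate_per_s))


def hot_by_latency(fault, threshold_s):
    # Time-based check: hot if the hint fault latency is below the threshold.
    return fault.latency_s < threshold_s


def hot_by_fault_count(fault, threshold_faults):
    # Count-based check: hot if few hint faults happened since the scan.
    return fault.faults_since_scan < threshold_faults


# Page scanned at t=0; accesses begin 60s later (scan period >> 5s).
burst_start = 60.0
one_access = [burst_start]
burst_512 = [burst_start + i * (5.0 / 512) for i in range(512)]

single = first_hint_fault(0.0, one_access, fault_rate_per_s=1000.0)
burst = first_hint_fault(0.0, burst_512, fault_rate_per_s=1000.0)

# Both access patterns produce the same observation, so neither check
# can tell them apart, whatever thresholds are chosen.
print(single == burst)
print(hot_by_latency(single, 1.0), hot_by_latency(burst, 1.0))
print(hot_by_fault_count(single, 10000), hot_by_fault_count(burst, 10000))
```

In this model the 1-access and 512-access cases yield an identical
observation, so the outcome depends only on when the scan happened
relative to the first access, not on how many accesses followed.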

It would be better to discuss the behavior with a more detailed example:
for instance, when the page is scanned, how many pages are accessed, how
long the interval between accesses is, etc.

--
Best Regards,
Huang, Ying