Re: [External Mail] [RFC PATCH v2] Weighted interleave auto-tuning

From: Huang, Ying
Date: Tue Dec 24 2024 - 18:49:00 EST


Gregory Price <gourry@xxxxxxxxxx> writes:

> On Sun, Dec 22, 2024 at 03:21:32PM +0800, Huang, Ying wrote:
>> Hyeonggon Yoo <hyeonggon.yoo@xxxxxx> writes:
>>
>> > On this server, ideally weighted interleaving should be configured
>> > within a socket (e.g. local NUMA node + local CXL node) because
>> > weighted interleaving does not consider the bandwidth when accessed
>> > from a remote socket.
>>
>> If multiple sockets are considered, what is the best behavior?
>>
>> The process may be cross-socket too. So, we will need to use
>> set_mempolicy() to bind tasks to sockets first. Then, it may be
>> better to use per-task weights.
>>
>
> If we want to revisit this, we might be able to make task-local weights
> work without a new syscall, but the use case was not clear enough which
> is why it was soft-nak'd originally.

Yes, that is doable. However, the challenge is the lack of use cases. I
guess we can wait until more use cases emerge?
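
To make the cross-socket concern concrete, here is a minimal sketch of why
one global weight per node cannot fit every socket. Everything in it is an
assumption for illustration: the node numbering, the BANDWIDTH figures, and
the weights_for() helper are hypothetical and are not taken from the patch
or from any measurement.

```python
from functools import reduce
from math import gcd

# Hypothetical bandwidth (GB/s) from each socket to each memory node.
# Assumed layout: nodes 0/1 are the DRAM of sockets 0/1,
# nodes 2/3 are CXL devices attached to sockets 0/1.
BANDWIDTH = {
    0: {0: 300, 1: 120, 2: 60, 3: 40},
    1: {0: 120, 1: 300, 2: 40, 3: 60},
}


def weights_for(socket, nodes):
    """Reduce the bandwidth ratio of `nodes`, as seen from `socket`,
    to the smallest equivalent integer weights."""
    bw = [BANDWIDTH[socket][n] for n in nodes]
    divisor = reduce(gcd, bw)
    return {n: b // divisor for n, b in zip(nodes, bw)}


# The same node pair (DRAM node 0 + CXL node 2) seen from both sockets.
local_view = weights_for(0, [0, 2])
remote_view = weights_for(1, [0, 2])
print(local_view, remote_view)
```

With these made-up numbers the two sockets want different ratios for the
same pair of nodes, which is the case per-task (or per-socket) weights
would address and a single global weight table cannot.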

> vma-local weights are arguably more usable, but require the task to be
> numa-aware and probably require a new mempolicy syscall because mbind
> has no remaining arguments.
>
> recall my original testing results from stream:
> https://lore.kernel.org/linux-mm/20240202170238.90004-1-gregory.price@xxxxxxxxxxxx/
>
> Stream Benchmark (vs DRAM, 1 Socket + 1 CXL Device)
> Default interleave : -78% (slower than DRAM)
> Global weighting : -6% to +4% (workload dependent)
> Targeted weights : +2.5% to +4% (consistently better than DRAM)
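
For readers comparing the rows above: default interleave amounts to equal
weights on every node, while weighted interleave places several consecutive
pages on a node before moving to the next one. The following is a toy model
of that placement, not the kernel implementation; the weighted_interleave()
function and the 3:1 weights are assumptions chosen only for illustration.

```python
from collections import Counter


def weighted_interleave(weights, num_pages):
    """Return the node chosen for each of `num_pages` pages.

    Nodes are visited in ascending order; node n receives weights[n]
    consecutive pages before the next node is used.
    """
    if not weights or any(w < 1 for w in weights.values()):
        raise ValueError("every node needs a weight of at least 1")
    order = []
    while len(order) < num_pages:
        for node in sorted(weights):
            order.extend([node] * weights[node])
    return order[:num_pages]


# Hypothetical weights: node 0 = DRAM (weight 3), node 1 = CXL (weight 1).
weighted = weighted_interleave({0: 3, 1: 1}, 8)
# Default interleave behaves like equal weights on both nodes.
default = weighted_interleave({0: 1, 1: 1}, 8)
print(weighted, Counter(weighted))
print(default, Counter(default))
```

In this model the 1:1 case sends half of all pages to the slower node
regardless of its bandwidth, which is consistent with default interleave
being the worst row in the table.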

---
Best Regards,
Huang, Ying