Re: [External Mail] [RFC PATCH v2] Weighted interleave auto-tuning

From: Huang, Ying
Date: Sun Dec 22 2024 - 02:22:32 EST


Hyeonggon Yoo <hyeonggon.yoo@xxxxxx> writes:

> On 2024-12-20 4:18 AM, Joshua Hahn wrote:

[snip]

>
> By the way, this might be out of scope, but let me ask for my own
> learning.
>
> We have a server with 2 sockets, each attached with local DRAM and CXL
> memory (and thus 4 NUMA nodes). When accessing remote socket's memory
> (either CXL or not), the bandwidth is limited by the interconnect's
> bandwidth.
>
> On this server, ideally weighted interleaving should be configured
> within a socket (e.g. local NUMA node + local CXL node) because
> weighted interleaving does not consider the bandwidth when accessed
> from a remote socket.

If multiple sockets are considered, what is the best behavior?

A process may run across sockets too.  So, we will need to use
set_mempolicy() to bind tasks to sockets first.  Then, it may be
better to use per-task weights.
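
To make the binding step concrete, here is a rough sketch (not part of
the patch) of restricting a task's weighted interleave to the nodes of
one socket.  The node numbering is an assumption for illustration only
(socket 0 = DRAM node 0 + CXL node 2); the helper names are made up.
MPOL_WEIGHTED_INTERLEAVE needs Linux 6.9 or later.  The weights
themselves are still the system-wide ones under
/sys/kernel/mm/mempolicy/weighted_interleave/, so this only narrows the
node set; it does not give per-task weights.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Value from include/uapi/linux/mempolicy.h; older headers may lack it. */
#ifndef MPOL_WEIGHTED_INTERLEAVE
#define MPOL_WEIGHTED_INTERLEAVE 6
#endif

/*
 * Hypothetical helper: build a single-word nodemask from a list of
 * node IDs.  Every ID must be smaller than the number of bits in an
 * unsigned long.
 */
unsigned long socket_nodemask(const int *nodes, int count)
{
	unsigned long mask = 0;

	for (int i = 0; i < count; i++)
		mask |= 1UL << nodes[i];
	return mask;
}

/*
 * Hypothetical helper: make the calling task interleave only across
 * the given nodes.  Returns 0 on success, -1 with errno set otherwise.
 */
long interleave_within_socket(const int *nodes, int count)
{
	unsigned long mask = socket_nodemask(nodes, count);

	/*
	 * The kernel treats maxnode as one more than the number of
	 * usable bits, so pass the mask width plus one.
	 */
	return syscall(SYS_set_mempolicy, MPOL_WEIGHTED_INTERLEAVE,
		       &mask, 8 * sizeof(mask) + 1);
}
```

A task on socket 0 would call interleave_within_socket() with
{0, 2} under the assumed numbering.  Note this only constrains where
memory is allocated; keeping the task's CPUs on the same socket would
still need sched_setaffinity() or cpusets.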

> So, the question is: On systems with multiple sockets (and CXL mem
> attached to each socket), do you always assume the admin must bind to
> a specific socket for optimal performance or is there any plan to
> mitigate this problem without binding tasks to a socket?
>

[snip]

---
Best Regards,
Huang, Ying