Re: [RFC] cxl/region: set numa node for target memdevs when a region is committed
From: Dan Williams
Date: Tue Mar 18 2025 - 17:25:57 EST
Dave Jiang wrote:
>
>
> On 3/14/25 9:40 AM, nifan.cxl@xxxxxxxxx wrote:
> > From: Fan Ni <fan.ni@xxxxxxxxxxx>
> >
> > There is a sysfs attribute named "numa_node" for CXL memory devices.
> > However, it is never set, so -1 is returned whenever it is read.
> >
> > With this change, the numa_node of each target memdev is set based on the
> > start address of the hpa_range of its associated endpoint decoder when a
> > cxl region is created, and it is reset when the region decoders are
> > reset.
> >
> > Open question: do we need to set the numa_node when the memdev is
> > probed instead of waiting until a region is created?
>
> Typically, the numa node for a PCI device should be dev_to_node(),
> where the device resides. So when the device is probed, it should be
> set with that. See documentation [1]. Region should have its own NUMA
> node based on phys_to_target_node() of the starting address.

Right, the memdev node is the affinity of device-MMIO to a CPU. The
HDM-memory that the device decodes may land in multiple proximity
domains and is subject to CDAT, CXL QoS, HMAT Generic Port, etc...

If your memdev node is "NUMA_NO_NODE" then that likely means the
affinity information for the PCI device is missing.

I would double check that first. See set_dev_node() in device_add().
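
One way to do that double check from userspace is to read the "numa_node"
attribute of both the memdev and its parent PCI device and see which of
them reports -1. The following is only a minimal sketch: the helper names
read_numa_node() and report_numa_node() are made up for illustration, and
the sysfs paths in the usage note are assumed examples that will differ
per system.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define NUMA_NO_NODE (-1)	/* same value the kernel uses for "no node" */

/*
 * Hypothetical helper: read a sysfs "numa_node" attribute file.
 * Returns 0 and stores the parsed node in *node on success, or -1 if
 * the file cannot be opened, read, or parsed as an integer.
 */
int read_numa_node(const char *path, int *node)
{
	char buf[32];
	char *end;
	long val;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	val = strtol(buf, &end, 10);
	if (errno || end == buf)
		return -1;

	*node = (int)val;
	return 0;
}

/* Hypothetical helper: print the node and flag missing affinity. */
void report_numa_node(const char *label, const char *path)
{
	int node;

	if (read_numa_node(path, &node)) {
		printf("%s: cannot read %s\n", label, path);
		return;
	}
	if (node == NUMA_NO_NODE)
		printf("%s: NUMA_NO_NODE, affinity info likely missing\n", label);
	else
		printf("%s: node %d\n", label, node);
}
```

For example, calling report_numa_node("memdev",
"/sys/bus/cxl/devices/mem0/numa_node") and then report_numa_node("pci",
"/sys/bus/pci/devices/<BDF>/numa_node") with the BDF of the parent device
shows whether the PCI device itself already lacks a node.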