Re: [RFC] cxl/region: set numa node for target memdevs when a region is committed

From: Dave Jiang
Date: Tue Mar 18 2025 - 17:00:37 EST




On 3/14/25 9:40 AM, nifan.cxl@xxxxxxxxx wrote:
> From: Fan Ni <fan.ni@xxxxxxxxxxx>
>
> There is a sysfs attribute named "numa_node" for CXL memory devices.
> However, it is never set, so -1 is returned whenever it is read.
>
> With this change, the numa_node of each target memdev is set based on the
> start address of the hpa_range of the endpoint decoder it is associated
> with when a cxl region is created, and it is reset when the region
> decoders are reset.
>
> Open question: do we need to set the numa_node when the memdev is
> probed instead of waiting until a region is created?

Typically, the NUMA node for a PCI device is whatever dev_to_node() reports, i.e. the node where the device resides. So the memdev's numa_node should be set from that when the device is probed. See the documentation [1]. The region should have its own NUMA node, based on phys_to_target_node() of its starting address.

[1]: https://elixir.bootlin.com/linux/v6.14-rc6/source/Documentation/ABI/testing/sysfs-bus-cxl#L85
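
To make the split concrete, here is a minimal userspace sketch, not kernel code. Every mock_* name, the struct layouts, and the address-to-node table are hypothetical stand-ins invented for illustration; only the idea (device node taken from where the device resides at probe time, region node looked up from its start address) comes from the text above. It also assumes the memdev takes its node from its parent device.

```c
/* Userspace illustration only; all mock_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUMA_NO_NODE (-1)

/* Stand-in for a device: just a node id and a parent pointer. */
struct mock_device {
	int numa_node;
	struct mock_device *parent;
};

/* Stand-in for a region: start of its host physical address range. */
struct mock_region {
	uint64_t hpa_start;
	int numa_node;
};

/* Hypothetical address ranges and the nodes they map to. */
struct mock_range {
	uint64_t start;
	uint64_t end;
	int node;
};

static const struct mock_range mock_ranges[] = {
	{ 0x000000000ULL, 0x0ffffffffULL, 0 },
	{ 0x100000000ULL, 0x1ffffffffULL, 2 },
};

/* Mimics the role of dev_to_node(): report the device's own node. */
static int mock_dev_to_node(const struct mock_device *dev)
{
	return dev->numa_node;
}

/* Mimics the role of phys_to_target_node(): address -> node lookup. */
static int mock_phys_to_target_node(uint64_t addr)
{
	for (size_t i = 0; i < sizeof(mock_ranges) / sizeof(mock_ranges[0]); i++) {
		if (addr >= mock_ranges[i].start && addr <= mock_ranges[i].end)
			return mock_ranges[i].node;
	}
	return NUMA_NO_NODE;
}

/* Probe time: the memdev takes the node of the device it sits on. */
static void mock_memdev_probe(struct mock_device *memdev)
{
	assert(memdev->parent != NULL);
	memdev->numa_node = mock_dev_to_node(memdev->parent);
}

/* Commit time: only the region's node is derived from the address. */
static void mock_region_commit(struct mock_region *region)
{
	region->numa_node = mock_phys_to_target_node(region->hpa_start);
}
```

In this sketch, committing a region never touches the memdev's node, so the two values can legitimately differ.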

DJ

>
> Signed-off-by: Fan Ni <fan.ni@xxxxxxxxxxx>
> ---
> drivers/cxl/core/region.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> diff --git a/drivers/cxl/core/region.c b/drivers/cxl/core/region.c
> index e8d11a988fd9..935ee0b1dd26 100644
> --- a/drivers/cxl/core/region.c
> +++ b/drivers/cxl/core/region.c
> @@ -242,6 +242,13 @@ static int cxl_region_invalidate_memregion(struct cxl_region *cxlr)
> return 0;
> }
>
> +static void cxl_mem_reset_numa_node(struct cxl_endpoint_decoder *cxled)
> +{
> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +
> +	cxlmd->dev.numa_node = NUMA_NO_NODE;
> +}
> +
> static void cxl_region_decode_reset(struct cxl_region *cxlr, int count)
> {
> struct cxl_region_params *p = &cxlr->params;
> @@ -264,6 +271,7 @@ static void cxl_region_decode_reset(struct cxl_region *cxlr, int count)
> if (cxlds->rcd)
> goto endpoint_reset;
>
> + cxl_mem_reset_numa_node(cxled);
> while (!is_cxl_root(to_cxl_port(iter->dev.parent)))
> iter = to_cxl_port(iter->dev.parent);
>
> @@ -304,6 +312,15 @@ static int commit_decoder(struct cxl_decoder *cxld)
> return 0;
> }
>
> +static void cxl_mem_set_numa_node(struct cxl_endpoint_decoder *cxled)
> +{
> +	struct cxl_memdev *cxlmd = cxled_to_memdev(cxled);
> +	u64 addr = cxled->cxld.hpa_range.start;
> +
> +	cxlmd->dev.numa_node = phys_to_target_node(addr);
> +	dev_dbg(&cxlmd->dev, "set numa node: %d\n", phys_to_target_node(addr));
> +}
> +
> static int cxl_region_decode_commit(struct cxl_region *cxlr)
> {
> struct cxl_region_params *p = &cxlr->params;
> @@ -340,6 +357,7 @@ static int cxl_region_decode_commit(struct cxl_region *cxlr)
> cxled->cxld.reset(&cxled->cxld);
> goto err;
> }
> + cxl_mem_set_numa_node(cxled);
> }
>
> return 0;