Re: [PATCH][rdma-next] RDMA/erdma: Use NUMA-aware allocation for MTT tables

From: Cheng Xu

Date: Wed Feb 25 2026 - 06:34:13 EST

On 2/25/26 4:51 PM, lirongqing wrote:
> From: Li RongQing <lirongqing@xxxxxxxxx>
>
> Currently, MTT (Memory Translation Table) buffers are allocated without
> NUMA awareness using kzalloc() and vzalloc(), which allocate memory on
> the NUMA node of the calling CPU. This can lead to cross-node memory
> access latencies if the erdma device is attached to a different NUMA
> socket.
>
> Switch to kzalloc_node() and vzalloc_node() to ensure MTT buffers are
> allocated on the local NUMA node of the PCIe device (dev->attrs.numa_node).
> This reduces latency for hardware access and improves performance.
>
> Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
> ---
> drivers/infiniband/hw/erdma/erdma_verbs.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>

Hi, Li RongQing,

Thanks for the patch. However, I think it is better to keep the current
behavior, for the following reasons:

1. This path is in the control plane, so allocating memory from a remote
NUMA node should not have a noticeable performance impact.
2. With this change, the driver may fail the allocation when the local NUMA
node is out of memory, even if other nodes still have available memory.
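
To make point 2 concrete, here is a toy userspace model of the scenario. It
is not kernel code, and every name in it (toy_numa, toy_alloc_on_node) is
hypothetical. It contrasts a strictly node-bound allocation with a
preferred-node allocation that falls back to other nodes. The failure
described in point 2 corresponds to the strict case. In the kernel,
__GFP_THISNODE is the flag that forbids fallback, so how closely a given
kzalloc_node() call matches the strict case depends on the GFP flags used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_NODES 2

/* Free pages per NUMA node in this toy model (not a kernel structure). */
struct toy_numa {
	size_t free_pages[TOY_NR_NODES];
};

/*
 * Take nr_pages from the preferred node.
 * strict == true:  only the preferred node is tried.
 * strict == false: the remaining nodes are tried in index order.
 * Returns the node that served the request, or -1 on failure.
 */
static int toy_alloc_on_node(struct toy_numa *numa, int node,
			     size_t nr_pages, bool strict)
{
	assert(node >= 0 && node < TOY_NR_NODES);

	if (numa->free_pages[node] >= nr_pages) {
		numa->free_pages[node] -= nr_pages;
		return node;
	}
	if (strict)
		return -1;

	/* Fallback: any other node with enough free pages. */
	for (int i = 0; i < TOY_NR_NODES; i++) {
		if (i != node && numa->free_pages[i] >= nr_pages) {
			numa->free_pages[i] -= nr_pages;
			return i;
		}
	}
	return -1;
}
```

With node 0 empty and node 1 holding free pages, a strict request for node 0
fails, while a non-strict request is served from node 1.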

Thanks,
Cheng Xu

> diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c
> index 9f74aad..58da6ef 100644
> --- a/drivers/infiniband/hw/erdma/erdma_verbs.c
> +++ b/drivers/infiniband/hw/erdma/erdma_verbs.c
> @@ -604,7 +604,7 @@ static struct erdma_mtt *erdma_create_cont_mtt(struct erdma_dev *dev,
>  		return ERR_PTR(-ENOMEM);
>
>  	mtt->size = size;
> -	mtt->buf = kzalloc(mtt->size, GFP_KERNEL);
> +	mtt->buf = kzalloc_node(mtt->size, GFP_KERNEL, dev->attrs.numa_node);
>  	if (!mtt->buf)
>  		goto err_free_mtt;
>
> @@ -729,7 +729,7 @@ static struct erdma_mtt *erdma_create_scatter_mtt(struct erdma_dev *dev,
>  		return ERR_PTR(-ENOMEM);
>
>  	mtt->size = ALIGN(size, PAGE_SIZE);
> -	mtt->buf = vzalloc(mtt->size);
> +	mtt->buf = vzalloc_node(mtt->size, dev->attrs.numa_node);
>  	mtt->continuous = false;
>  	if (!mtt->buf)
>  		goto err_free_mtt;