Re: [External Mail] Re: [PATCH][rdma-next] RDMA/erdma: Use NUMA-aware allocation for MTT tables

From: Li,Rongqing(ACG CCN)

Date: Wed Feb 25 2026 - 07:10:24 EST



> > On 2/25/26 4:51 PM, lirongqing wrote:
> > > From: Li RongQing <lirongqing@xxxxxxxxx>
> > >
> > > Currently, MTT (Memory Translation Table) buffers are allocated
> > > without NUMA awareness using kzalloc() and vzalloc(), which allocate
> > > memory on the NUMA node of the calling CPU. This can lead to
> > > cross-node memory access latencies if the erdma device is attached
> > > to a different NUMA socket.
> > >
> > > Switch to kzalloc_node() and vzalloc_node() to ensure MTT buffers
> > > are allocated on the local NUMA node of the PCIe device
> > > (dev->attrs.numa_node).
> > > This reduces latency for hardware access and improves performance.
> > >
> > > Signed-off-by: Li RongQing <lirongqing@xxxxxxxxx>
> > > ---
> > > drivers/infiniband/hw/erdma/erdma_verbs.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> >
> > Hi, Li RongQing,
> >
> > Thanks for the patch. However, I think it is better to keep the
> > current behavior, for the following reasons:
> >
> > 1. This path is in the control plane, so allocating memory from a remote
> > NUMA node should not have a noticeable performance impact.
>
> If a TLB miss or an internal cache miss occurs, does the HCA need to query the MTT?
>
> [Li,Rongqing]
>
> > 2. With this change, the driver may fail the allocation when the local NUMA
> > node is out of memory, even if other nodes still have available memory.
> >


When kmalloc_node() is called without __GFP_THISNODE and the target node
lacks sufficient memory, SLUB falls back to allocating a folio from a
node other than the requested one.

So I think this is not a problem.
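
To illustrate the fallback behavior described above, here is a toy
userspace model. It is not kernel code: struct toy_node, toy_alloc_node()
and the this_node_only flag are hypothetical names invented for this
sketch, and the index-order fallback scan is a simplification of how the
kernel actually picks a fallback node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: each "node" just tracks how many pages are free. */
struct toy_node {
	size_t free_pages;
};

/*
 * Try to take 'pages' from 'preferred' first. If 'this_node_only' is set
 * (standing in for __GFP_THISNODE), fail instead of falling back.
 * Returns the node index that satisfied the request, or -1 on failure.
 */
static int toy_alloc_node(struct toy_node *nodes, int nr_nodes, int preferred,
			  size_t pages, bool this_node_only)
{
	assert(nr_nodes > 0 && preferred >= 0 && preferred < nr_nodes);

	if (nodes[preferred].free_pages >= pages) {
		nodes[preferred].free_pages -= pages;
		return preferred;
	}
	if (this_node_only)
		return -1;

	/* Fallback: scan the other nodes in index order (a simplification). */
	for (int i = 0; i < nr_nodes; i++) {
		if (i != preferred && nodes[i].free_pages >= pages) {
			nodes[i].free_pages -= pages;
			return i;
		}
	}
	return -1;
}
```

With node 0 nearly full and node 1 mostly free, a request that prefers
node 0 fails only when this_node_only is set; otherwise it is served
from node 1.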

[Li,Rongqing]



> > Thanks,
> > Cheng Xu
> >
> > > > diff --git a/drivers/infiniband/hw/erdma/erdma_verbs.c b/drivers/infiniband/hw/erdma/erdma_verbs.c
> > > index 9f74aad..58da6ef 100644
> > > --- a/drivers/infiniband/hw/erdma/erdma_verbs.c
> > > +++ b/drivers/infiniband/hw/erdma/erdma_verbs.c
> > > @@ -604,7 +604,7 @@ static struct erdma_mtt
> > *erdma_create_cont_mtt(struct erdma_dev *dev,
> > > return ERR_PTR(-ENOMEM);
> > >
> > > mtt->size = size;
> > > - mtt->buf = kzalloc(mtt->size, GFP_KERNEL);
> > > > + mtt->buf = kzalloc_node(mtt->size, GFP_KERNEL, dev->attrs.numa_node);
> > > if (!mtt->buf)
> > > goto err_free_mtt;
> > >
> > > @@ -729,7 +729,7 @@ static struct erdma_mtt
> > *erdma_create_scatter_mtt(struct erdma_dev *dev,
> > > return ERR_PTR(-ENOMEM);
> > >
> > > mtt->size = ALIGN(size, PAGE_SIZE);
> > > - mtt->buf = vzalloc(mtt->size);
> > > + mtt->buf = vzalloc_node(mtt->size, dev->attrs.numa_node);
> > > mtt->continuous = false;
> > > if (!mtt->buf)
> > > goto err_free_mtt;