RE: [PATCH V3,net-next] net: mana: Add page pool for RX buffers

From: Haiyang Zhang
Date: Mon Jul 24 2023 - 11:46:10 EST




> -----Original Message-----
> From: Jesper Dangaard Brouer <jbrouer@xxxxxxxxxx>
> Sent: Monday, July 24, 2023 7:29 AM
> To: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>; linux-hyperv@xxxxxxxxxxxxxxx;
> netdev@xxxxxxxxxxxxxxx
> Cc: brouer@xxxxxxxxxx; Dexuan Cui <decui@xxxxxxxxxxxxx>; KY Srinivasan
> <kys@xxxxxxxxxxxxx>; Paul Rosswurm <paulros@xxxxxxxxxxxxx>;
> olaf@xxxxxxxxx; vkuznets@xxxxxxxxxx; davem@xxxxxxxxxxxxx;
> wei.liu@xxxxxxxxxx; edumazet@xxxxxxxxxx; kuba@xxxxxxxxxx;
> pabeni@xxxxxxxxxx; leon@xxxxxxxxxx; Long Li <longli@xxxxxxxxxxxxx>;
> ssengar@xxxxxxxxxxxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx;
> daniel@xxxxxxxxxxxxx; john.fastabend@xxxxxxxxx; bpf@xxxxxxxxxxxxxxx;
> ast@xxxxxxxxxx; Ajay Sharma <sharmaajay@xxxxxxxxxxxxx>; hawk@xxxxxxxxxx;
> tglx@xxxxxxxxxxxxx; shradhagupta@xxxxxxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Ilias Apalodimas <ilias.apalodimas@xxxxxxxxxx>; Jesper
> Dangaard Brouer <hawk@xxxxxxxxxx>
> Subject: Re: [PATCH V3,net-next] net: mana: Add page pool for RX buffers
>
>
>
> On 21/07/2023 21.05, Haiyang Zhang wrote:
> > Add page pool for RX buffers for faster buffer cycle and reduce CPU
> > usage.
> >
> > The standard page pool API is used.
> >
> > Signed-off-by: Haiyang Zhang <haiyangz@xxxxxxxxxxxxx>
> > ---
> > V3:
> > Update xdp mem model, pool param, alloc as suggested by Jakub Kicinski
> > V2:
> > Use the standard page pool API as suggested by Jesper Dangaard Brouer
> >
> > ---
> > drivers/net/ethernet/microsoft/mana/mana_en.c | 91 +++++++++++++++----
> > include/net/mana/mana.h | 3 +
> > 2 files changed, 78 insertions(+), 16 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c
> b/drivers/net/ethernet/microsoft/mana/mana_en.c
> > index a499e460594b..4307f25f8c7a 100644
> > --- a/drivers/net/ethernet/microsoft/mana/mana_en.c
> > +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c
> [...]
> > @@ -1659,6 +1679,8 @@ static void mana_poll_rx_cq(struct mana_cq *cq)
> >
> > if (rxq->xdp_flush)
> > xdp_do_flush();
> > +
> > + page_pool_nid_changed(rxq->page_pool, numa_mem_id());
>
> I don't think this page_pool_nid_changed() called is needed, if you do
> as I suggest below (nid = NUMA_NO_NODE).
>
>
> > }
> >
> > static int mana_cq_handler(void *context, struct gdma_queue
> *gdma_queue)
> [...]
>
> > @@ -2008,6 +2041,25 @@ static int mana_push_wqe(struct mana_rxq *rxq)
> > return 0;
> > }
> >
> > +static int mana_create_page_pool(struct mana_rxq *rxq)
> > +{
> > + struct page_pool_params pprm = {};
>
> You are implicitly assigning NUMA node id zero.
>
> > + int ret;
> > +
> > + pprm.pool_size = RX_BUFFERS_PER_QUEUE;
> > + pprm.napi = &rxq->rx_cq.napi;
>
> You likely want to assign pprm.nid to NUMA_NO_NODE
>
> pprm.nid = NUMA_NO_NODE;
>
> For most drivers it is recommended to assign ``NUMA_NO_NODE`` (value -1)
> as the NUMA ID to ``pp_params.nid``. When ``CONFIG_NUMA`` is enabled
> this setting will automatically select the (preferred) NUMA node (via
> ``numa_mem_id()``) based on where NAPI RX-processing is currently
> running. The effect is that page_pool will only use recycled memory when
> NUMA node match running CPU. This assumes CPU refilling driver RX-ring
> will also run RX-NAPI.
>
> If a driver want more control over the NUMA node memory selection,
> drivers can assign (``pp_params.nid``) something else than
> `NUMA_NO_NODE`` and runtime adjust via function
> ``page_pool_nid_changed()``.

Our driver is using NUMA 0 by default, so I implicitly assign NUMA node id
to zero during pool init.

And, if the IRQ/CPU affinity is changed, the page_pool_nid_changed()
will update the nid for the pool. Does this sound good?

Thanks,
-Haiyang