RE: [EXT] Re: [PATCH net-next v2] octeontx2-pf: Add support for page pool
From: Sunil Kovvuri Goutham
Date: Fri May 19 2023 - 01:20:22 EST
> -----Original Message-----
> From: Yunsheng Lin <linyunsheng@xxxxxxxxxx>
> Sent: Friday, May 19, 2023 8:07 AM
> To: Ratheesh Kannoth <rkannoth@xxxxxxxxxxx>; netdev@xxxxxxxxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx
> Cc: Sunil Kovvuri Goutham <sgoutham@xxxxxxxxxxx>; davem@xxxxxxxxxxxxx;
> edumazet@xxxxxxxxxx; kuba@xxxxxxxxxx; pabeni@xxxxxxxxxx; Subbaraya
> Sundeep Bhatta <sbhatta@xxxxxxxxxxx>; Geethasowjanya Akula
> <gakula@xxxxxxxxxxx>; Srujana Challa <schalla@xxxxxxxxxxx>; Hariprasad
> Kelam <hkelam@xxxxxxxxxxx>
> Subject: [EXT] Re: [PATCH net-next v2] octeontx2-pf: Add support for page pool
>
> On 2023/5/19 9:52, Ratheesh Kannoth wrote:
> >> On 2023/5/18 13:51, Ratheesh Kannoth wrote:
> >>> A page pool for each Rx queue improves Rx-side performance by
> >>> reclaiming buffers back to the queue-specific pool. DMA mapping is
> >>> done only on the first allocation of a buffer.
> >>> As subsequent buffer allocations avoid DMA mapping, performance
> >>> improves.
> >>>
> >>> Image | Performance with Linux kernel Packet Generator
> >>
> >> Is there any more detailed info on the performance data?
> >> Does 'kernel Packet Generator' mean the pktgen module in
> >> net/core/pktgen.c? pktgen seems to be more for Tx; is there any
> >> obvious reason why the page pool optimization for Rx has brought
> >> an improvement of about ten times?
> > We used the packet generator on the TX machine. The performance data is
> > for the RX DUT. I will remove the packet generator text from the commit
> > message as it is ambiguous.
> > DUT Rx <------------------------- TX (Linux machine with packet generator)
> > (page pool support)
>
> Thanks for clarifying.
> Does DUT stand for 'Device Under Test'?
> What does the DUT do after it receives a packet? XDP DROP?
>
> >
> >>
> >>> ------------ | -----------------------------------------------
> >>> Vanilla      | 3Mpps
> >>>              |
> >>> with this    | 42Mpps
> >>> change       |
> >>> -------------------------------------------------------------
> >>>
> >>
> >> ...
> >>
> >>>  static int __otx2_alloc_rbuf(struct otx2_nic *pfvf, struct otx2_pool *pool,
> >>>                               dma_addr_t *dma)
> >>>  {
> >>>          u8 *buf;
> >>>
> >>> +        if (pool->page_pool)
> >>> +                return otx2_alloc_pool_buf(pfvf, pool, dma);
> >>> +
> >>>          buf = napi_alloc_frag_align(pool->rbsize, OTX2_ALIGN);
> >>>          if (unlikely(!buf))
> >>>                  return -ENOMEM;
> >>
> >> It seems the above is dead code when using 'select PAGE_POOL', as
> >> PAGE_POOL config is always selected by the driver?
> > __otx2_alloc_rbuf() is common code for RX and TX. For RX, pool->page_pool
> > != NULL, so the allocation comes from the page pool.
> >
>
> Am I missing something here? 'buf' is dma-mapped with DMA_FROM_DEVICE,
> can it be used for TX?
>
> Also, what does the 'r' in __otx2_alloc_rbuf() mean?
>
HW takes care of cache coherency between the device and the CPU, hence
DMA_ATTR_SKIP_CPU_SYNC is used, and the direction of DMA doesn't matter here.
So instead of duplicating the API, the same 'otx2_alloc_rbuf' is used for both
Rx and Tx. The 'r' stands for receive.
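
To illustrate the shared-allocator idea, here is a minimal userspace sketch.
It is not driver code: every name in it (toy_pool, toy_alloc, toy_run and so
on) is hypothetical, the "DMA mapping" is only a counter, and 'use_recycling'
merely stands in for the pool->page_pool != NULL check in the patch. It
assumes one allocator entry point that takes a recycled buffer when recycling
is enabled and otherwise falls back to a fresh allocation plus mapping.

```c
/* Userspace toy model, not driver code. All names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_CACHE_SLOTS 8

struct toy_buf {
	void *mem;	/* backing memory */
	int mapped;	/* 1 once the simulated mapping exists */
};

struct toy_pool {
	int use_recycling;	/* stands in for pool->page_pool != NULL */
	struct toy_buf *cache[TOY_CACHE_SLOTS];	/* buffers handed back for reuse */
	size_t cached;
	unsigned long map_calls;	/* how many times a buffer was "mapped" */
};

/* Simulated mapping: it only marks the buffer and counts the call. */
static void toy_map(struct toy_pool *pool, struct toy_buf *buf)
{
	buf->mapped = 1;
	pool->map_calls++;
}

/* Single allocation entry point shared by both kinds of pool. */
static struct toy_buf *toy_alloc(struct toy_pool *pool, size_t size)
{
	struct toy_buf *buf;

	/* Recycling path: reuse a returned buffer, its mapping is still valid. */
	if (pool->use_recycling && pool->cached > 0)
		return pool->cache[--pool->cached];

	/* Fallback path: fresh memory needs a new mapping. */
	buf = malloc(sizeof(*buf));
	if (!buf)
		return NULL;
	buf->mem = malloc(size);
	if (!buf->mem) {
		free(buf);
		return NULL;
	}
	toy_map(pool, buf);
	return buf;
}

static void toy_free(struct toy_pool *pool, struct toy_buf *buf)
{
	/* With recycling, keep the buffer and its mapping for the next alloc. */
	if (pool->use_recycling && pool->cached < TOY_CACHE_SLOTS) {
		pool->cache[pool->cached++] = buf;
		return;
	}
	free(buf->mem);
	free(buf);
}

/* Allocate and release one buffer 'rounds' times; return the map count. */
static unsigned long toy_run(int use_recycling, int rounds)
{
	struct toy_pool pool = { .use_recycling = use_recycling };
	int i;

	for (i = 0; i < rounds; i++) {
		struct toy_buf *buf = toy_alloc(&pool, 2048);

		assert(buf && buf->mapped);
		toy_free(&pool, buf);
	}

	/* Drain the cache so nothing leaks. */
	while (pool.cached > 0) {
		struct toy_buf *buf = pool.cache[--pool.cached];

		free(buf->mem);
		free(buf);
	}
	return pool.map_calls;
}
```

In this model, toy_run(1, 1000) performs a single mapping while
toy_run(0, 1000) performs one per round. That is the effect the commit message
describes, although the model says nothing about the actual Mpps numbers.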
Thanks,
Sunil.