Re: [PATCH net] xsk: correct tx_ring_empty_descs count statistics

From: Wang Liang
Date: Tue Apr 01 2025 - 03:40:08 EST



On 2025/4/1 14:57, Magnus Karlsson wrote:
On Tue, 1 Apr 2025 at 04:36, Wang Liang <wangliang74@xxxxxxxxxx> wrote:

On 2025/4/1 6:03, Stanislav Fomichev wrote:
On 03/31, Stanislav Fomichev wrote:
On 03/29, Wang Liang wrote:
The tx_ring_empty_descs count may be incorrect when the XDP_TX_RING
option is set but the tx ring is never reserved, because xsk_poll() tries
to wake up the driver by calling xsk_generic_xmit() in non-zero-copy mode.
So the tx_ring_empty_descs count increases each time xsk_poll() is called:

xsk_poll
  xsk_generic_xmit
    __xsk_generic_xmit
      xskq_cons_peek_desc
        xskq_cons_read_desc
          q->queue_empty_descs++;

Sorry, but I do not understand how to reproduce this error. So you
first issue a setsockopt with the XDP_TX_RING option and then you do
not "reserve tx ring". What does that last "not reserve tx ring" mean?
No mmap() of that ring, or something else? I guess you have bound the
socket with a bind()? Some pseudo code on how to reproduce this would
be helpful. Just want to understand so I can help. Thank you.

OK. Some pseudo code is below:

    fd = socket(AF_XDP, SOCK_RAW, 0);
    setsockopt(fd, SOL_XDP, XDP_UMEM_REG, &mr, sizeof(mr));

    setsockopt(fd, SOL_XDP, XDP_UMEM_FILL_RING, &fill_size, sizeof(fill_size));
    setsockopt(fd, SOL_XDP, XDP_UMEM_COMPLETION_RING, &comp_size, sizeof(comp_size));
    mmap(NULL, off.fr.desc + fill_size * sizeof(__u64), ..., XDP_UMEM_PGOFF_FILL_RING);
    mmap(NULL, off.cr.desc + comp_size * sizeof(__u64), ..., XDP_UMEM_PGOFF_COMPLETION_RING);

    setsockopt(fd, SOL_XDP, XDP_RX_RING, &rx_size, sizeof(rx_size));
    setsockopt(fd, SOL_XDP, XDP_TX_RING, &tx_size, sizeof(tx_size));
    mmap(NULL, off.rx.desc + rx_size * sizeof(struct xdp_desc), ..., XDP_PGOFF_RX_RING);
    mmap(NULL, off.tx.desc + tx_size * sizeof(struct xdp_desc), ..., XDP_PGOFF_TX_RING);

    bind(fd, (struct sockaddr *)&sxdp, sizeof(sxdp));
    bpf_map_update_elem(xsk_map_fd, &queue_id, &fd, 0);

    while (!global_exit) {
        poll(fds, 1, -1);
        handle_receive_packets(...);
    }

The xsk is created successfully, and xs->tx is initialized.

"Not reserve tx ring" means that the user app never updates the tx ring
producer, i.e. it never does something like:

    xsk_ring_prod__reserve(tx, 1, &tx_idx);
    xsk_ring_prod__tx_desc(tx, tx_idx)->addr = frame;
    xsk_ring_prod__tx_desc(tx, tx_idx)->len = pkg_length;
    xsk_ring_prod__submit(tx, 1);

These functions (xsk_ring_prod__reserve, etc.) are provided by libxdp.

Since tx->producer is never updated, xs->tx->cached_cons and
xs->tx->cached_prod are always zero. When packets are received and the
user app calls poll(), xsk_poll() triggers xsk_generic_xmit(), which
leads to this issue.
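
To make the effect easier to see, here is a small user-space model of the behaviour. It is only a sketch and not kernel code: struct model_queue, model_peek_desc(), model_poll() and model_run_polls() are hypothetical names, and the logic is a simplified stand-in for the xsk_poll -> xskq_cons_read_desc call chain, assuming user space never advances the tx producer.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of a tx queue; NOT the kernel's struct xsk_queue. */
struct model_queue {
    uint32_t producer;           /* advanced by user space on submit (never, here) */
    uint32_t cached_prod;        /* consumer-side snapshot of producer */
    uint32_t cached_cons;        /* consumer position */
    uint64_t queue_empty_descs;  /* what would be reported as tx_ring_empty_descs */
};

/* Models the peek/read step: finding the ring empty bumps the counter. */
static int model_peek_desc(struct model_queue *q)
{
    if (q->cached_prod == q->cached_cons)
        q->cached_prod = q->producer;   /* refresh the snapshot */

    if (q->cached_cons != q->cached_prod)
        return 1;                       /* a descriptor is available */

    q->queue_empty_descs++;
    return 0;
}

/* Models poll() in copy mode: every call attempts a transmit pass. */
static void model_poll(struct model_queue *tx)
{
    while (model_peek_desc(tx))
        tx->cached_cons++;              /* consume one descriptor */
}

/* Calls model_poll() `polls` times on a tx ring that was never filled. */
static uint64_t model_run_polls(int polls)
{
    struct model_queue tx = {0};

    assert(polls >= 0);
    for (int i = 0; i < polls; i++)
        model_poll(&tx);

    return tx.queue_empty_descs;
}
```

In this model, calling model_run_polls(n) returns n: the counter grows by one per poll even though the application never tried to transmit anything.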