Re: [PATCH net-next v2 1/2] virtio_net: xsk: fix race in rx wake up

From: Bui Quang Minh

Date: Thu Jun 11 2026 - 12:34:12 EST


On 6/11/26 09:56, menglong8.dong@xxxxxxxxx wrote:
From: Menglong Dong <dongml2@xxxxxxxxxxxxxxx>

During packet receiving in virtio-net, the rq can be empty, which means
"rq->vq->num_free == virtqueue_get_vring_size(rq->vq)", in
virtnet_add_recvbuf_xsk(), if we are using xsk. Meanwhile, the fill ring
can be empty too, which means we can't allocate anything from
xsk_buff_alloc_batch(). Then, we will set the XDP_RING_NEED_WAKEUP flag.

However, if the user clean all the data in rx ring and fill the
"fill ring" and check the XDP_RING_NEED_WAKEUP flag after
xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), then the rx
napi will never be scheduled: the rx ring is empty, which means we will
never receive a packet to trigger the further recv fill. The rx ring is
empty now, so the user will not check the flag too.

Fix this by set the XDP_RING_NEED_WAKEUP flag before
xsk_buff_alloc_batch() if both rq->vq and fill ring are empty.

Meanwhile, set the XDP_RING_NEED_WAKEUP flag if we have any free entry in
rq->vq.

Fixes: e3f8800aa243 ("virtio-net: xsk: Support wakeup on RX side")
Signed-off-by: Menglong Dong <dongml2@xxxxxxxxxxxxxxx>
---
drivers/net/virtio_net.c | 25 ++++++++++++++++++++++---
1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f4adcfee7a80..4b5b3fa62008 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1323,16 +1323,27 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
struct xsk_buff_pool *pool, gfp_t gfp)
{
struct xdp_buff **xsk_buffs;
+ bool need_wakeup;
dma_addr_t addr;
int err = 0;
u32 len, i;
int num;
+ need_wakeup = xsk_uses_need_wakeup(pool);
xsk_buffs = rq->xsk_buffs;
+ /* If both rq->vq and fill ring are empty, and then the user submit
+ * all the chunks to the fill ring and check the wake up flag
+ * after xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(),
+ * we will lose the chance to wake up the rx napi, so we have to
+ * set the need_wakeup flag here.
+ */
+ if (need_wakeup && virtqueue_get_vring_size(rq->vq) == rq->vq->num_free)
+ xsk_set_rx_need_wakeup(pool);

I think that when polling the receive queue, the userspace program needs to check the XDP_RING_NEED_WAKEUP flag whenever it does not see any packets. The flag check is quite lightweight in my opinion. Here are some examples I found:

- https://github.com/xdp-project/xdp-tools/blob/e9469501622aa22a7e452a671000bec8685edcde/lib/util/xdpsock.c#L1206
- https://github.com/xdp-project/bpf-examples/blob/43e565901c4287efa863edca7f0e6cd6e35ed896/AF_XDP-forwarding/xsk_fwd.c#L540
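
A minimal sketch of the check those examples perform, written as a standalone helper so it compiles without libxdp or kernel headers. The helper name rx_should_kick() and the local NEED_WAKEUP_FLAG define are hypothetical, introduced only for illustration; the flag value mirrors XDP_RING_NEED_WAKEUP from <linux/if_xdp.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy of XDP_RING_NEED_WAKEUP (1 << 0 in <linux/if_xdp.h>),
 * redefined here so the sketch builds without kernel headers.
 */
#define NEED_WAKEUP_FLAG (1u << 0)

/* Hypothetical helper: decide whether the RX loop should issue a
 * wakeup syscall after one pass over the RX ring.
 *
 * rcvd            - number of descriptors the pass found in the RX ring
 * fill_ring_flags - current value of the fill ring's flags word
 *
 * If packets were received, the loop processes them and no kick is
 * needed on this pass. If nothing was received, the flag is checked,
 * which costs a single load and a mask.
 */
static bool rx_should_kick(unsigned int rcvd, uint32_t fill_ring_flags)
{
	if (rcvd > 0)
		return false;

	return (fill_ring_flags & NEED_WAKEUP_FLAG) != 0;
}
```

In a real loop, fill_ring_flags would be read from the fill ring itself (libxdp wraps this check as xsk_ring_prod__needs_wakeup()), and a true result would be followed by a recvfrom() or poll() on the socket fd to reschedule the rx napi.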

Furthermore, the XDP_RING_NEED_WAKEUP flag related functions do not provide any memory ordering. So even with your patch, I'm worried that this case is still possible:

kernel                                                      userspace

xsk_buff_alloc_batch -> failed
                                                            submit fill ring
                                                            flag != XDP_RING_NEED_WAKEUP
// reordering due to lack of memory orderings
xsk_set_rx_need_wakeup

I'm not an expert here, so correct me if I'm wrong. I think the wake up flag is designed without any ordering guarantees, so we cannot rely on it in this reasoning and skip further checks.
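
To make the ordering concern concrete, here is a toy model in C11 atomics of what both sides would need for the flag alone to be reliable. It is not kernel or libxdp code, and it is not a proposal for the driver: the names fill_entries, need_wakeup, kernel_try_sleep() and user_refill() are all hypothetical. It only shows the generic "store, full fence, then re-check" pattern, where each side publishes its own state before reading the other's.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy shared state (hypothetical names, not real ring fields). */
static atomic_uint fill_entries;  /* models entries posted to the fill ring */
static atomic_bool need_wakeup;   /* models XDP_RING_NEED_WAKEUP */

/* Models the kernel side. Returns true if it goes idle with the
 * flag set, false if buffers are available and it keeps running.
 */
static bool kernel_try_sleep(void)
{
	if (atomic_load_explicit(&fill_entries, memory_order_acquire) > 0)
		return false;

	/* Publish the flag first ... */
	atomic_store_explicit(&need_wakeup, true, memory_order_relaxed);
	/* ... then a full fence, so the re-check below cannot be
	 * ordered before the flag store.
	 */
	atomic_thread_fence(memory_order_seq_cst);

	if (atomic_load_explicit(&fill_entries, memory_order_relaxed) > 0) {
		/* User refilled in the window: do not go idle. */
		atomic_store_explicit(&need_wakeup, false, memory_order_relaxed);
		return false;
	}
	return true;
}

/* Models the user side. Returns true if a wakeup syscall is needed. */
static bool user_refill(unsigned int n)
{
	/* Publish the new fill entries first ... */
	atomic_fetch_add_explicit(&fill_entries, n, memory_order_release);
	/* ... then a full fence before reading the flag. */
	atomic_thread_fence(memory_order_seq_cst);

	return atomic_load_explicit(&need_wakeup, memory_order_relaxed);
}
```

With a full fence on both sides, at least one side observes the other's store: either the kernel model sees the refill on its re-check, or the user model sees the flag. Without such fences on both sides, the interleaving in the diagram above stays possible, which is why a check on the empty-poll path looks like the safer thing to rely on.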

+
num = xsk_buff_alloc_batch(pool, xsk_buffs, rq->vq->num_free);
if (!num) {
- if (xsk_uses_need_wakeup(pool)) {
+ if (need_wakeup) {
xsk_set_rx_need_wakeup(pool);
/* Return 0 instead of -ENOMEM so that NAPI is
* descheduled.
@@ -1341,8 +1352,6 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
}
return -ENOMEM;
- } else {
- xsk_clear_rx_need_wakeup(pool);
}
len = xsk_pool_get_rx_frame_size(pool) + vi->hdr_len;
@@ -1363,6 +1372,16 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
goto err;
}
+ if (need_wakeup) {
+ if (rq->vq->num_free)
+ /* We have free buffers, so we'd better wake up the
+ * rx napi as soon as possible.
+ */
+ xsk_set_rx_need_wakeup(pool);
+ else
+ xsk_clear_rx_need_wakeup(pool);
+ }
+

Why do we need to set XDP_RING_NEED_WAKEUP even when xsk_buff_alloc_batch succeeds?

return num;
err:

Thanks,
Quang Minh.