[PATCH 5.18 203/339] xsk: Fix handling of invalid descriptors in XSK TX batching API

From: Greg Kroah-Hartman
Date: Mon Jun 13 2022 - 09:57:44 EST


From: Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx>

[ Upstream commit d678cbd2f867a564a3c5b276c454e873f43f02f8 ]

xdpxceiver run on a AF_XDP ZC enabled driver revealed a problem with XSK
Tx batching API. There is a test that checks how invalid Tx descriptors
are handled by AF_XDP. Each valid descriptor is followed by invalid one
on Tx side whereas the Rx side expects only to receive a set of valid
descriptors.

In current xsk_tx_peek_release_desc_batch() function, the amount of
available descriptors is hidden inside xskq_cons_peek_desc_batch(). This
can be problematic in cases where invalid descriptors are present due to
the fact that xskq_cons_peek_desc_batch() returns only a count of valid
descriptors. This means that it is impossible to properly update XSK
ring state when calling xskq_cons_release_n().

To address this issue, pull out the contents of
xskq_cons_peek_desc_batch() so that callers (currently only
xsk_tx_peek_release_desc_batch()) will always be able to update the
state of ring properly, as total count of entries is now available and
use this value as an argument in xskq_cons_release_n(). By
doing so, xskq_cons_peek_desc_batch() can be dropped altogether.

Fixes: 9349eb3a9d2a ("xsk: Introduce batched Tx descriptor interfaces")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@xxxxxxxxx>
Signed-off-by: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
Acked-by: Magnus Karlsson <magnus.karlsson@xxxxxxxxx>
Link: https://lore.kernel.org/bpf/20220607142200.576735-1-maciej.fijalkowski@xxxxxxxxx
Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>
---
net/xdp/xsk.c | 5 +++--
net/xdp/xsk_queue.h | 8 --------
2 files changed, 3 insertions(+), 10 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 3a9348030e20..d6bcdbfd0fc5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -373,7 +373,8 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries)
goto out;
}

- nb_pkts = xskq_cons_peek_desc_batch(xs->tx, pool, max_entries);
+ max_entries = xskq_cons_nb_entries(xs->tx, max_entries);
+ nb_pkts = xskq_cons_read_desc_batch(xs->tx, pool, max_entries);
if (!nb_pkts) {
xs->tx->queue_empty_descs++;
goto out;
@@ -389,7 +390,7 @@ u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max_entries)
if (!nb_pkts)
goto out;

- xskq_cons_release_n(xs->tx, nb_pkts);
+ xskq_cons_release_n(xs->tx, max_entries);
__xskq_cons_release(xs->tx);
xs->sk.sk_write_space(&xs->sk);

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 801cda5d1938..64b43f31942f 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -282,14 +282,6 @@ static inline bool xskq_cons_peek_desc(struct xsk_queue *q,
return xskq_cons_read_desc(q, desc, pool);
}

-static inline u32 xskq_cons_peek_desc_batch(struct xsk_queue *q, struct xsk_buff_pool *pool,
- u32 max)
-{
- u32 entries = xskq_cons_nb_entries(q, max);
-
- return xskq_cons_read_desc_batch(q, pool, entries);
-}
-
/* To improve performance in the xskq_cons_release functions, only update local state here.
* Reflect this to global state when we get new entries from the ring in
* xskq_cons_get_entries() and whenever Rx or Tx processing are completed in the NAPI loop.
--
2.35.1