Re: [net-next v9 07/10] net: bnxt: Implement software USO

From: Joe Damato

Date: Wed Apr 08 2026 - 13:05:30 EST


On Tue, Apr 07, 2026 at 04:23:00PM -0700, Joe Damato wrote:
> On Tue, Apr 07, 2026 at 03:03:03PM -0700, Joe Damato wrote:
>
> [...]
>
> > v9:
> > - Added inline slot check to prevent possible overwriting of in-flight
> > headers (suggested by AI).
>
> [...]
>
> > netdev_tx_t bnxt_sw_udp_gso_xmit(struct bnxt *bp,
> > struct bnxt_tx_ring_info *txr,
> > struct netdev_queue *txq,
> > struct sk_buff *skb)
> > {
>
> [...]
>
> > +
> > + /* BD backpressure alone cannot prevent overwriting in-flight
> > + * headers in the inline buffer. Check slot availability directly.
> > + */
> > + slots = txr->tx_inline_prod - txr->tx_inline_cons;
> > + slots = BNXT_SW_USO_MAX_SEGS - slots;
> > +
> > + if (unlikely(slots < num_segs)) {
> > + netif_txq_try_stop(txq, slots, num_segs);
> > + return NETDEV_TX_BUSY;
>
> This is the check I added. AI says this is wrong and netdev_queues.h says:
>
> * @get_desc must be a formula or a function call, it must always
> * return up-to-date information when evaluated!
>
> which I obviously failed to do, so I'm pretty sure I got this wrong.

So, there are two options I can think of to fix this. I am leaning toward
option 2, but if there are any strong opinions (or other options that I am
missing), please let me know:

1. Allocate the maximum number of slots per ring and eliminate this check
entirely. I figured this would be disliked because it potentially wastes
memory. The driver would need ring_size / 3 slots, and if we assume the
maximum ring size is 2048 and the slot size is 256 bytes, that works out
to roughly 175 KB per ring. Of course, this only affects NICs with SW USO;
the buffer isn't allocated for NICs with HW USO.

This is probably simpler, but costs more memory than the existing design.

2. Or, keep the smaller buffer that we have now (BNXT_SW_USO_MAX_SEGS (64)
* 256 bytes = 16 KB per ring) and fix the try_stop like this:

+static inline u16 bnxt_inline_avail(struct bnxt_tx_ring_info *txr)
+{
+	return BNXT_SW_USO_MAX_SEGS -
+	       (u16)(txr->tx_inline_prod - READ_ONCE(txr->tx_inline_cons));
+}
+

[...]

-	slots = txr->tx_inline_prod - txr->tx_inline_cons;
-	slots = BNXT_SW_USO_MAX_SEGS - slots;
-
-	if (unlikely(slots < num_segs)) {
-		netif_txq_try_stop(txq, slots, num_segs);
+	if (unlikely(bnxt_inline_avail(txr) < num_segs)) {
+		netif_txq_try_stop(txq, bnxt_inline_avail(txr), num_segs);
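
For reference, here is a minimal userspace sketch of the free-slot arithmetic
that option 2 relies on. It is not the driver code: struct inline_ring,
RING_SLOTS, and the ring_*() helpers are hypothetical stand-ins for
tx_inline_prod, tx_inline_cons, and BNXT_SW_USO_MAX_SEGS, and it leaves out
READ_ONCE() and everything else about concurrency. It only shows that the
16-bit subtraction stays correct across counter wraparound, and that the
available count is recomputed from the counters on every call rather than
cached in a local, which is what the @get_desc comment quoted above asks for.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for BNXT_SW_USO_MAX_SEGS; not the bnxt definition. */
#define RING_SLOTS 64u

/* Hypothetical stand-in for the inline-buffer counters in the TX ring. */
struct inline_ring {
	uint16_t prod;	/* advanced by the transmit path */
	uint16_t cons;	/* advanced by the completion path */
};

/* Slots currently holding in-flight headers. The cast to a 16-bit type
 * keeps the difference correct after either counter wraps past 65535.
 */
static uint16_t ring_in_flight(const struct inline_ring *r)
{
	return (uint16_t)(r->prod - r->cons);
}

/* Free slots, recomputed from the counters on every call. */
static uint16_t ring_avail(const struct inline_ring *r)
{
	return (uint16_t)(RING_SLOTS - ring_in_flight(r));
}

/* Claim n slots; the caller must have checked ring_avail() first. */
static void ring_produce(struct inline_ring *r, uint16_t n)
{
	assert(n <= ring_avail(r));
	r->prod = (uint16_t)(r->prod + n);
}

/* Release n slots once their headers are no longer in flight. */
static void ring_consume(struct inline_ring *r, uint16_t n)
{
	assert(n <= ring_in_flight(r));
	r->cons = (uint16_t)(r->cons + n);
}
```

Starting both counters at 65530 and producing 10 slots wraps prod around to 4,
yet ring_in_flight() still reports 10 and ring_avail() reports 54. Passing a
function call like this as the get_desc argument, instead of a precomputed
local, means any later evaluation sees the current counters.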