Re: [RFC 2/2] xdp: Delegate fast path return decision to page_pool

From: Jesper Dangaard Brouer
Date: Tue Dec 02 2025 - 09:01:06 EST





On 01/12/2025 11.12, Dragos Tatulea wrote:


[...]
And then you can run this command:
sudo ./xdp-bench redirect-map --load-egress mlx5p1 mlx5p1

Ah, yes! I wasn't aware of the egress part of the program.
That did the trick: the drop now happens before reaching the TX
queue of the second netdev, and the mentioned code in devmap.c
is reached.

The sender is xdp-trafficgen with 3 threads, pushing enough traffic at
one RX queue to saturate the CPU.

Here's what I got:

* before:

eth2->eth3 16,153,328 rx/s 16,153,329 err,drop/s 0 xmit/s
xmit eth2->eth3 0 xmit/s 16,153,329 drop/s 0 drv_err/s 16.00 bulk-avg
eth2->eth3 16,152,538 rx/s 16,152,546 err,drop/s 0 xmit/s
xmit eth2->eth3 0 xmit/s 16,152,546 drop/s 0 drv_err/s 16.00 bulk-avg
eth2->eth3 16,156,331 rx/s 16,156,337 err,drop/s 0 xmit/s
xmit eth2->eth3 0 xmit/s 16,156,337 drop/s 0 drv_err/s 16.00 bulk-avg

* after:

eth2->eth3 16,105,461 rx/s 16,105,469 err,drop/s 0 xmit/s
xmit eth2->eth3 0 xmit/s 16,105,469 drop/s 0 drv_err/s 16.00 bulk-avg
eth2->eth3 16,119,550 rx/s 16,119,541 err,drop/s 0 xmit/s
xmit eth2->eth3 0 xmit/s 16,119,541 drop/s 0 drv_err/s 16.00 bulk-avg
eth2->eth3 16,092,145 rx/s 16,092,154 err,drop/s 0 xmit/s
xmit eth2->eth3 0 xmit/s 16,092,154 drop/s 0 drv_err/s 16.00 bulk-avg

So slightly worse... I don't fully trust the measurements though, as I
also saw the inverse in other tests: a higher rate after the
patch.

Remember that you are also removing some code (the
xdp_set_return_frame_no_direct and xdp_clear_return_frame_no_direct
calls, sketched below).
Thus, I was actually hoping we would see a higher rate after the patch.
This is why I wanted to see this XDP-redirect test, instead of the
page_pool micro-benchmark.
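
For reference, these helpers are essentially just a per-CPU flag. A
simplified sketch of how they look in include/net/xdp.h (the struct
holding the flag has moved around between kernel versions, so details
may differ):

static inline void xdp_set_return_frame_no_direct(void)
{
	/* Mark this CPU as "no direct page_pool recycling allowed" */
	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);

	ri->kern_flags |= BPF_RI_F_RF_NO_DIRECT;
}

static inline void xdp_clear_return_frame_no_direct(void)
{
	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);

	ri->kern_flags &= ~BPF_RI_F_RF_NO_DIRECT;
}

static inline bool xdp_return_frame_no_direct(void)
{
	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);

	return ri->kern_flags & BPF_RI_F_RF_NO_DIRECT;
}

__xdp_return() consults xdp_return_frame_no_direct() before trusting
the napi_direct hint, and as I read the RFC this is exactly the
decision being delegated to page_pool.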


I had a chance to re-run this on a more stable system and the conclusion
is the same. Performance is ~2% worse:

* before:
eth2->eth3 13,746,431 rx/s 13,746,471 err,drop/s 0 xmit/s
xmit eth2->eth3 0 xmit/s 13,746,471 drop/s 0 drv_err/s 16.00 bulk-avg

* after:
eth2->eth3 13,437,277 rx/s 13,437,259 err,drop/s 0 xmit/s
xmit eth2->eth3 0 xmit/s 13,437,259 drop/s 0 drv_err/s 16.00 bulk-avg

After this experiment it doesn't seem like this direction is worth
pursuing... I was more optimistic at the start.

I do think it is worth proceeding. I will claim that your PPS results
are basically the same. Converting the PPS numbers to nanoseconds per
packet:

13,746,471 pps = (1/13746471 * 10^9) = 72.75 nanosec
13,437,259 pps = (1/13437259 * 10^9) = 74.42 nanosec
Difference = (74.42 - 72.75) = 1.67 nanosec

In my experience it is very hard to find a system stable enough to
measure a 2 nanosec difference. As you also note, you had to spend
effort finding a stable system. Thus, I claim your results show no
noticeable performance impact.

My only concern (based on your perf symbols) is that you might not be
testing the right/expected code path. If mlx5 is running with a
page_pool memory mode that has elevated refcnt on the pages, then we
will not be exercising the slower page_pool ptr_ring return path as
much as expected. I guess I will have to do this experiment in my own
testlab on other NIC drivers that don't use elevated refcnt by default.
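
To make that concern concrete: the recycle decision looks roughly like
the sketch below (simplified, not the exact code in
net/core/page_pool.c, and the wrapper name is made up):

static void recycle_decision_sketch(struct page_pool *pool,
				    struct page *page, bool allow_direct)
{
	/* Only a page with a single reference is recycled.  With an
	 * elevated refcnt (fragmented pages) this test usually fails,
	 * so neither recycle path below gets exercised very often.
	 */
	if (page_ref_count(page) != 1) {
		page_pool_return_page(pool, page);
		return;
	}

	if (allow_direct && in_softirq())
		/* lockless per-CPU cache: the fast path */
		page_pool_recycle_in_cache(page, pool);
	else
		/* ptr_ring: the slower, locked path */
		page_pool_recycle_in_ring(pool, page);
}

That is why results from a driver that does not elevate the refcnt by
default would be more telling.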


Toke (and I) would appreciate it if you added code for this to xdp-bench,
supporting a --program-mode like 'redirect-cpu' does.


Ok. I will add it.

Added it here:
https://github.com/xdp-project/xdp-tools/pull/532


Thanks, I'll take a look, and I'm sure Toke will have opinions on the
cmdline options and the missing man-page update.

--Jesper