Re: [PATCH net-next v6 5/5] veth: time-based BQL completion coalescing via ethtool tx-usecs

From: Simon Schippers

Date: Mon Jun 01 2026 - 12:31:21 EST


On 6/1/26 16:03, Jonas Köppeler wrote:
> On 6/1/26 2:00 PM, Simon Schippers wrote:
>> On 5/28/26 09:46, Jonas Köppeler wrote:
>>> On 5/27/26 3:54 PM, hawk@xxxxxxxxxx wrote:
>>>> From: Simon Schippers <simon.schippers@xxxxxxxxxxxxxx>
>>>>
>>>> Per-packet BQL completion forces DQL to converge on limit=2, causing
>>>> excessive NAPI scheduling overhead and qdisc requeues.
>>>>
>>>> Accumulate BQL completions and flush them when a configurable time
>>>> threshold is exceeded, letting DQL discover a limit that bounds actual
>>>> queuing delay to the configured interval. Coalescing state persists
>>>> across NAPI polls in struct veth_rq so completions can accumulate
>>>> beyond a single budget=64 cycle.
>>>>
>>>> Add ethtool tx-usecs support for runtime tuning. Default is 100 us;
>>>> setting tx-usecs to 0 disables coalescing and falls back to per-packet
>>>> completion.
>>>>
>>>> ethtool -C <veth-dev> tx-usecs 500 # 500us coalescing
>>>> ethtool -C <veth-dev> tx-usecs 0 # per-packet (no coalescing)
>>>>
>>>> Co-developed-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
>>>> Signed-off-by: Jesper Dangaard Brouer <hawk@xxxxxxxxxx>
>>>> Signed-off-by: Simon Schippers <simon.schippers@xxxxxxxxxxxxxx>
>>> Tested-by: Jonas Köppeler <j.koeppeler@xxxxxxxxxxxx>
>>>
>> Thanks for your testing!
>>
>> However, I have issues reproducing.
>> I run bare metal (without virtme) with v6 + your pktgen patch
>> and I am on the branch pktgen-and-benchmark, commit
>> "results: add veth-bql measurements":
>>
>> 1. ping fails with 100% packet loss ~20% of the time with --pktgen.
>> When this happens the avg ping of this run is mistakenly set
>> to 0.0 ms, which distorts the results.
>> I fixed it locally by rerunning when this happens.
>>
>> 2. pktgen runs with > 3 Mpps even with --nrules 10000, see log below.
>> I see that this is because of qdisc drops.
>> I also tried pfifo and sfq but with the same result.
>> I spent quite some time on it but I do not know a fix.
>>
>> Do you have an idea?
>> Thanks!
>
> Hi,
> yes there are some changes missing in the test script.
> I have pushed it now, sorry. This should fix 1.

I pulled it and ran...

sudo ./veth_bql_sweep.sh --runs 1 --pktgen --duration 20 --qdisc fq_codel --no-bpftrace

... but 8 of the 32 ping results (1/4) are still zero, and I do not
see a pattern.


I grabbed the logs from /tmp and this is what a failing
ping looks like:

PING 10.99.0.2 (10.99.0.2) 56(84) bytes of data.

--- 10.99.0.2 ping statistics ---
97 packets transmitted, 0 received, 100% packet loss, time 19967ms


This feels like a race condition.
Can you reproduce it with the exact command above?
I think you need --runs 1; otherwise it just averages over multiple
runs.
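
To illustrate the 0.0 ms problem, here is a minimal sketch of how a
failed ping could be kept out of the average instead of being recorded
as zero. This is not what veth_bql_sweep.sh does: the function names
are hypothetical, and it assumes the iputils summary format seen in the
log above.

```python
import re
from typing import List, Optional

# iputils summary lines, e.g.
#   "97 packets transmitted, 0 received, 100% packet loss, time 19967ms"
#   "rtt min/avg/max/mdev = 0.045/0.061/0.089/0.012 ms"
COUNTS_RE = re.compile(r"(\d+) packets transmitted, (\d+) received")
RTT_RE = re.compile(r"rtt min/avg/max/mdev = [\d.]+/([\d.]+)/[\d.]+/[\d.]+ ms")


def parse_avg_rtt_ms(ping_output: str) -> Optional[float]:
    """Return the average RTT in ms, or None if no reply was received."""
    counts = COUNTS_RE.search(ping_output)
    if counts is None or int(counts.group(2)) == 0:
        return None  # failed run: must not be reported as 0.0 ms
    rtt = RTT_RE.search(ping_output)
    return float(rtt.group(1)) if rtt else None


def mean_rtt_ms(outputs: List[str]) -> Optional[float]:
    """Average over successful runs only; None if every run failed."""
    samples = [r for r in map(parse_avg_rtt_ms, outputs) if r is not None]
    return sum(samples) / len(samples) if samples else None
```

With this, a run with 100% loss yields None, so the caller can rerun it
or drop it rather than pull the average towards zero.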

> Regarding 2.: do not look at the pktgen output, in the
> new version you will see something like "goodput",
> which is the number you should look for.
> Pktgen will report at what speed it enqueued packets in
> the qdisc.

Exactly. Now it works. I had a single outlier, but apart from that
everything is fine.

Thanks,
Simon

>
> Let me know if it worked.
> Best,
> Jonas
>