Re: [REGRESSION] Massive virtio-net throughput drop in guest VM with Linux 6.8+

From: Torsten Krah
Date: Fri Apr 04 2025 - 03:59:42 EST


On Wednesday, 02.04.2025 at 23:12 +0200, Markus Fohrer wrote:
> When running on a host system equipped with a Broadcom NetXtreme-E
> (bnxt_en) NIC and AMD EPYC CPUs, the network throughput in the guest
> drops to 100–200 KB/s. The same guest configuration performs normally
> (~100 MB/s) when using kernel 6.8.0 or when the VM is moved to a host
> with Intel NICs.

Hi,

I am affected too. Here is the link to the Ubuntu bug report, in case
someone wants to have a look:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2098961

We're seeing lots of these messages in the dmesg output:

[ 561.505323] net_ratelimit: 1396 callbacks suppressed
[ 561.505339] ens18: bad gso: type: 4, size: 1448
[ 561.505343] ens18: bad gso: type: 4, size: 1448
[ 561.507270] ens18: bad gso: type: 4, size: 1448
[ 561.508257] ens18: bad gso: type: 4, size: 1448
[ 561.511432] ens18: bad gso: type: 4, size: 1448
[ 561.511452] ens18: bad gso: type: 4, size: 1448
[ 561.514719] ens18: bad gso: type: 4, size: 1448
[ 561.514966] ens18: bad gso: type: 4, size: 1448
[ 561.518553] ens18: bad gso: type: 4, size: 1448
[ 561.518781] ens18: bad gso: type: 4, size: 1448
[ 566.506044] net_ratelimit: 1363 callbacks suppressed
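
To tally such messages from a saved dmesg log, a small script along the
following lines may help. It is only a sketch: the name `summarize` and
the regular expressions are illustrative, and it assumes that the
printed "type" is the gso_type field of the virtio-net header, with the
constants taken from include/uapi/linux/virtio_net.h. If that assumption
holds, type 4 would be VIRTIO_NET_HDR_GSO_TCPV6.

```python
import re
from collections import Counter

# VIRTIO_NET_HDR_GSO_* values from include/uapi/linux/virtio_net.h.
# Assumption: the "type" in the log line is the virtio_net_hdr gso_type.
GSO_TYPES = {0: "NONE", 1: "TCPV4", 3: "UDP", 4: "TCPV6", 5: "UDP_L4"}
GSO_ECN = 0x80  # flag bit that may be OR-ed into the type

BAD_GSO = re.compile(r"(\S+): bad gso: type: (\d+), size: (\d+)")
SUPPRESSED = re.compile(r"net_ratelimit: (\d+) callbacks suppressed")


def summarize(lines):
    """Count logged 'bad gso' lines per (iface, type name, size)
    and sum the callbacks reported as suppressed by net_ratelimit."""
    logged = Counter()
    suppressed = 0
    for line in lines:
        m = BAD_GSO.search(line)
        if m:
            iface = m.group(1)
            gso_type = int(m.group(2))
            size = int(m.group(3))
            # Mask out the ECN flag before looking up the base type.
            name = GSO_TYPES.get(gso_type & ~GSO_ECN, "unknown")
            logged[(iface, name, size)] += 1
            continue
        m = SUPPRESSED.search(line)
        if m:
            suppressed += int(m.group(1))
    return logged, suppressed
```

For example, `summarize(open("dmesg.txt"))` (hypothetical file name)
returns the per-interface counts and the total number of suppressed
callbacks. Note that the suppressed count covers every rate-limited
network message, not only the bad gso ones.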


Another interesting observation, at least in our environment: we can
trigger the regression only with IPv4 traffic (bad performance and lots
of bad gso messages). If we use only IPv6, everything works (good
performance and not a single bad gso message).

Kind regards

Torsten