TCP and BBR: reproducibly low cwnd and bandwidth

From: Oleksandr Natalenko
Date: Thu Feb 15 2018 - 15:42:35 EST


Hello.

I've run into an issue with limited TCP bandwidth between my laptop and a
server on my 1 Gbps LAN when using BBR as the congestion control algorithm. To
verify my observations, I've set up 2 KVM VMs with the following parameters:

1) Linux v4.15.3
2) virtio NICs
3) 128 MiB of RAM
4) 2 vCPUs
5) tested on both non-PREEMPT/100 Hz and PREEMPT/1000 Hz

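For reference, each scenario below amounts to roughly the following commands
(a sketch; 192.168.122.10 stands in for the server VM's address):

```shell
# On each VM, select the congestion control under test
# (the name must appear in /proc/sys/net/ipv4/tcp_available_congestion_control)
sysctl -w net.ipv4.tcp_congestion_control=bbr   # or reno / yeah

# On the server VM:
iperf3 -s

# On the client VM:
iperf3 -c 192.168.122.10        # upload (default mode, client sends)
iperf3 -c 192.168.122.10 -R     # download (reverse mode, server sends)
```
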
The VMs are interconnected via a host bridge (-netdev bridge). I ran iperf3
in both the default and reverse modes. Here are the results:

1) BBR on both VMs

upload: 3.42 Gbits/sec, cwnd ~ 320 KBytes
download: 3.39 Gbits/sec, cwnd ~ 320 KBytes

2) Reno on both VMs

upload: 5.50 Gbits/sec, cwnd = 976 KBytes (constant)
download: 5.22 Gbits/sec, cwnd = 1.20 MBytes (constant)

3) Reno on client, BBR on server

upload: 5.29 Gbits/sec, cwnd = 952 KBytes (constant)
download: 3.45 Gbits/sec, cwnd ~ 320 KBytes

4) BBR on client, Reno on server

upload: 3.36 Gbits/sec, cwnd ~ 370 KBytes
download: 5.21 Gbits/sec, cwnd = 887 KBytes (constant)

So, as you can see, whenever BBR is in use, the upload rate is poor and cwnd
stays low. On real HW (1 Gbps LAN, laptop and server), BBR limits the
throughput to ~100 Mbps (verified not only with iperf3, but also with scp
while transferring files between the hosts).
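To capture the numbers above, the per-socket state can be inspected while a
transfer is running (a sketch; the address is a placeholder; for BBR sockets
ss also reports the estimated bandwidth, min RTT and pacing rate):

```shell
# -t TCP sockets, -i internal TCP info (cwnd, rtt, pacing_rate, bbr:(...)),
# -n numeric addresses; the dst filter narrows it to the iperf3 connection
ss -tin dst 192.168.122.10
```
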

I've also tried YeAH instead of Reno, and it gives the same results as Reno
(IOW, YeAH works fine too).
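
In case it matters, the set of algorithms the kernel will accept can be
checked like this (yeah and bbr may need their modules loaded first):

```shell
cat /proc/sys/net/ipv4/tcp_available_congestion_control
# if an algorithm is missing, e.g.:
modprobe tcp_yeah
modprobe tcp_bbr
```
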

Questions:

1) is this expected?
2) or am I missing some extra BBR tunable?
3) if it is not a regression (I don't have any previous data to compare with),
how can I fix this?
4) if it is a bug in BBR, what else should I provide or check for a proper
investigation?

Thanks.

Regards,
Oleksandr