[PATCH v2 0/4] net: ntb_netdev: Add Multi-queue support
From: Koichiro Den
Date: Sat Feb 28 2026 - 09:55:58 EST
Hi,
ntb_netdev currently hard-codes a single NTB transport queue pair, which
means the datapath effectively runs as a single-queue netdev regardless
of available CPUs / parallel flows.
The longer-term motivation here is throughput scale-out: allow
ntb_netdev to grow beyond the single-QP bottleneck and make it possible
to spread TX/RX work across multiple queue pairs as link speeds and core
counts keep increasing.
Multi-queue also unlocks the standard networking knobs built on top of
it. In particular, once the device exposes multiple TX queues, qdisc/tc
can steer flows or traffic classes into different queues (via
skb->queue_mapping), enabling per-flow/per-class scheduling and QoS in a
familiar way.
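As a concrete sketch of such steering (the interface name eth0 and port
5201 are placeholders for illustration, not part of this series), a tc
flower filter with the skbedit action can pin a flow to a specific TX
queue once two queues are enabled:

```shell
# Assumes eth0 is the ntb_netdev interface and two combined queues
# are already configured via ethtool -L.
tc qdisc add dev eth0 clsact
# Pin TCP traffic to port 5201 (e.g. iperf3) onto TX queue 1 by
# setting skb->queue_mapping at egress.
tc filter add dev eth0 egress protocol ip flower ip_proto tcp \
    dst_port 5201 action skbedit queue_mapping 1
```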
Usage
=====
1. Ensure the NTB device you want to use has multiple Memory Windows.
2. modprobe ntb_transport on both sides, if it's not built-in.
3. modprobe ntb_netdev on both sides, if it's not built-in.
4. Use ethtool -L to configure the desired number of queues.
The default number of real (combined) queues is 1.
e.g. ethtool -L eth0 combined 2 # to increase
ethtool -L eth0 combined 1 # to reduce back to 1
Note:
* If the NTB device has only a single Memory Window, ethtool -L eth0
combined N (N > 1) fails with:
"netlink error: No space left on device".
* ethtool -L can be executed while the net_device is up.
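For reference, the steps above condense into the following shell
sequence (assuming the ntb_netdev interface comes up as eth0; the
actual name will differ per setup):

```shell
modprobe ntb_transport         # skip if built-in
modprobe ntb_netdev            # skip if built-in
ip link set eth0 up            # ethtool -L also works while the link is up
ethtool -L eth0 combined 2     # request two combined queues
ethtool -l eth0                # verify the current channel counts
```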
Compatibility
=============
The default remains a single queue, so behavior is unchanged unless
the user explicitly increases the number of queues.
Kernel base
===========
ntb-next (latest as of 2026-02-28):
commit 7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link
disable path")
Testing / Results
=================
Environment / command line:
- 2x R-Car S4 Spider boards
"Kernel base" (see above) + this series
TCP:
[RC] $ sudo iperf3 -s
[EP] $ sudo iperf3 -Z -c ${SERVER_IP} -l 65480 -w 512M -P 4
UDP:
[RC] $ sudo iperf3 -s
[EP] $ sudo iperf3 -u -b 0 -c ${SERVER_IP} -l 65480 -w 512M -P 4
Without this series:
TCP / UDP : 589 Mbps / 580 Mbps
With this series (default single queue):
TCP / UDP : 583 Mbps / 583 Mbps
With this series + `ethtool -L eth0 combined 2`:
TCP / UDP : 576 Mbps / 584 Mbps
With this series + `ethtool -L eth0 combined 2` + [1], where flows are
properly distributed across queues:
TCP / UDP : 1.12 Gbps / 1.17 Gbps
The 575~590 Mbps variation is run-to-run variance, i.e. no measurable
regression or improvement is observed with a single queue. The key
point is the scaling from ~600 Mbps to ~1.2 Gbps once flows are
distributed across multiple queues.
Note: On R-Car S4 Spider, only BAR2 is usable for ntb_transport MW.
For testing, BAR2 was expanded from 1 MiB to 2 MiB and split into two
Memory Windows. A follow-up series is planned to add split BAR support
for vNTB. On platforms where multiple BARs can be used for the
datapath, this series should allow >=2 queues without additional
changes.
[1] [PATCH v2 00/10] NTB: epf: Enable per-doorbell bit handling while keeping legacy offset
https://lore.kernel.org/linux-pci/20260227084955.3184017-1-den@xxxxxxxxxxxxx/
(subject was accidentally incorrect in the original posting)
Changelog
=========
Changes in v2:
- Drop the ntb_num_queues module parameter and implement ethtool
.set_channels().
Patches 2-3 from v1 are dropped; patches 2-3 in v2 become preparatory
changes for the new patch 4, which implements .set_channels().
- Drop unrelated changes from Patch 1 to keep it focused and easier to
review.
Best regards,
Koichiro
Koichiro Den (4):
net: ntb_netdev: Introduce per-queue context
net: ntb_netdev: Gate subqueue stop/wake by transport link
net: ntb_netdev: Factor out multi-queue helpers
net: ntb_netdev: Support ethtool channels for multi-queue
drivers/net/ntb_netdev.c | 483 +++++++++++++++++++++++++++++++--------
1 file changed, 386 insertions(+), 97 deletions(-)
--
2.51.0