Re: [PATCH 0/3] net: ntb_netdev: Add Multi-queue support
From: Dave Jiang
Date: Tue Feb 24 2026 - 11:21:54 EST
On 2/24/26 8:28 AM, Koichiro Den wrote:
> Hi,
>
> ntb_netdev currently hard-codes a single NTB transport queue pair, which
> means the datapath effectively runs as a single-queue netdev regardless
> of available CPUs / parallel flows.
>
> The longer-term motivation here is throughput scale-out: allow
> ntb_netdev to grow beyond the single-QP bottleneck and make it possible
> to spread TX/RX work across multiple queue pairs as link speeds and core
> counts keep increasing.
>
> Multi-queue also unlocks the standard networking knobs. In
> particular, once the device exposes multiple TX queues, qdisc/tc can
> steer flows/traffic classes into different queues (via
> skb->queue_mapping), enabling per-flow/per-class scheduling and QoS in a
> familiar way.
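>
> As a rough illustration of how that mapping would be consumed on the
> TX path (simplified, assuming the per-queue context that patch 1
> introduces; field and function names here are illustrative, not
> necessarily what the patches use):
>
>   static netdev_tx_t ntb_netdev_start_xmit(struct sk_buff *skb,
>                                            struct net_device *ndev)
>   {
>           struct ntb_netdev *dev = netdev_priv(ndev);
>           /* queue index picked by the stack/qdisc (skb->queue_mapping) */
>           u16 qidx = skb_get_queue_mapping(skb);
>           struct ntb_netdev_queue *q = &dev->queues[qidx];
>
>           /* hand the skb to this queue's transport QP; the real code
>            * also stops the queue and counts errors on failure */
>           if (ntb_transport_tx_enqueue(q->qp, skb, skb->data, skb->len))
>                   return NETDEV_TX_BUSY;
>
>           return NETDEV_TX_OK;
>   }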
>
> This series is a small plumbing step in that direction:
>
> 1) Introduce a per-queue context object (struct ntb_netdev_queue) and
> move queue-pair state out of struct ntb_netdev. Probe creates queue
> pairs in a loop and configures the netdev queue counts to match the
> number that was successfully created.
>
> 2) Expose ntb_num_queues as a module parameter to request multiple
> queue pairs at probe time. The value is clamped to 1..64 and kept
> read-only for now (no runtime reconfiguration).
>
> 3) Report the active queue-pair count via ethtool -l (get_channels),
> so users can confirm the device configuration from user space.
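>
> For illustration, the shape of patches 1 and 2 is roughly the
> following (heavily simplified sketch; exact names, layout and error
> handling are whatever the patches actually do):
>
>   struct ntb_netdev_queue {
>           struct ntb_transport_qp *qp;
>           /* per-queue TX/RX state moves here out of struct ntb_netdev */
>   };
>
>   static unsigned int ntb_num_queues = 1;
>   module_param(ntb_num_queues, uint, 0444);   /* read-only for now */
>   MODULE_PARM_DESC(ntb_num_queues, "Number of NTB transport queue pairs");
>
>   /* in probe: */
>   num = clamp(ntb_num_queues, 1U, 64U);
>   for (i = 0; i < num; i++) {
>           dev->queues[i].qp =
>                   ntb_transport_create_queue(ndev, client_dev,
>                                              &ntb_netdev_handlers);
>           if (!dev->queues[i].qp)
>                   break;
>   }
>   /* expose only the queue pairs that were actually created */
>   netif_set_real_num_tx_queues(ndev, i);
>   netif_set_real_num_rx_queues(ndev, i);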
>
> Compatibility:
> - Default remains ntb_num_queues=1, so behaviour is unchanged unless
> the user explicitly requests more queues.
>
> Kernel base:
> - ntb-next latest:
> commit 7b3302c687ca ("ntb_hw_amd: Fix incorrect debug message in link
> disable path")
>
> Usage (example):
> - modprobe ntb_netdev ntb_num_queues=<N> # Patch 2 takes care of it
> - ethtool -l <ifname> # Patch 3 takes care of it
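>
> A minimal get_channels callback behind that command could look roughly
> like this (illustrative names; the actual patch may differ):
>
>   static void ntb_netdev_get_channels(struct net_device *ndev,
>                                       struct ethtool_channels *ch)
>   {
>           struct ntb_netdev *dev = netdev_priv(ndev);
>
>           ch->max_combined = 64;                 /* module param upper bound */
>           ch->combined_count = dev->num_queues;  /* active queue pairs */
>   }
>
>   static const struct ethtool_ops ntb_ethtool_ops = {
>           /* existing ops omitted */
>           .get_channels = ntb_netdev_get_channels,
>   };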
>
> Patch summary:
> 1/3 net: ntb_netdev: Introduce per-queue context
> 2/3 net: ntb_netdev: Make queue pair count configurable
> 3/3 net: ntb_netdev: Expose queue pair count via ethtool -l
>
> Testing / results:
> Environment / command line:
> - 2x R-Car S4 Spider boards, each running "Kernel base" (see above)
>   + this series
> - For TCP load:
> [RC] $ sudo iperf3 -s
> [EP] $ sudo iperf3 -Z -c ${SERVER_IP} -l 65480 -w 512M -P 4
> - For UDP load:
> [RC] $ sudo iperf3 -s
> [EP] $ sudo iperf3 -ub0 -c ${SERVER_IP} -l 65480 -w 512M -P 4
>
> Before (without this series):
> TCP / UDP : 602 Mbps / 598 Mbps
>
> After (this series, ntb_num_queues=1, the default):
> TCP / UDP : 588 Mbps / 605 Mbps
What accounts for the dip in TCP performance?
>
> After (ntb_num_queues=2):
> TCP / UDP : 602 Mbps / 598 Mbps
>
> Notes:
> In my current test environment, enabling multiple queue pairs does
> not improve throughput. The receive-side memcpy in ntb_transport is
> the dominant cost and limits scaling at present.
>
> Still, this series lays the groundwork for future scaling, for
> example once a transport backend is introduced that avoids memcpy
> to/from PCI memory space on both ends (see the superseded RFC
> series:
> https://lore.kernel.org/all/20251217151609.3162665-1-den@xxxxxxxxxxxxx/).
>
>
> Best regards,
> Koichiro
>
> Koichiro Den (3):
> net: ntb_netdev: Introduce per-queue context
> net: ntb_netdev: Make queue pair count configurable
> net: ntb_netdev: Expose queue pair count via ethtool -l
>
> drivers/net/ntb_netdev.c | 326 +++++++++++++++++++++++++++------------
> 1 file changed, 228 insertions(+), 98 deletions(-)
>
for the series
Reviewed-by: Dave Jiang <dave.jiang@xxxxxxxxx>