Re: [PATCH 2/2] usb: gadget: ncm: Add support to update wMaxSegmentSize via configfs

From: Krishna Kurapati PSSNV
Date: Sun Oct 15 2023 - 23:48:25 EST




On 10/16/2023 6:49 AM, Maciej Żenczykowski wrote:


Hmm, I'm not sure. I know I've experimented with high mtu ncm in the past
(around 2.5 years ago). I got it working between my Linux desktop (host)
and a Pixel 6 (device/gadget) with absolutely no problems.

I'm pretty sure I didn't change my desktop kernel, so I was probably
limited to 8192 there (and I do more or less remember that).
From what I vaguely remember, it wasn't difficult (at all) to hit
upwards of 7 Gbps for iperf tests.
I don't remember how close to the theoretical USB 10 Gbps maximum of
9.7 Gbps I could get...
[this was never the real bottleneck / issue, so I didn't ever dig
particularly deep]

I'm pretty sure my gadget side changes were non-configurable...
Probably just bumped one or two constants...

Could you share what parameters you changed to get this high value of
iperf throughput?

Eh, I really don't remember, but it wasn't anything earth shattering.
From what I recall it was just a matter of bumping mtu, and tweaking
irq pinning to stronger cores.
Indeed I'm not even certain that bumping the mtu was required to get over 5 Gbps.
Though I may be confusing some things, as at least some of the testing was done
with the kernel's built in packet generator.


I do *very* *vaguely* recall there being some funkiness though, where
8192 was *less* efficient than some slightly smaller value.

If I recall correctly the issue is that 8192 + ethernet overhead + NCM
overhead only fits *once* into 16384, which leaves a lot of space
wasted.
While ~7.5 kb + overhead fits twice and is thus a fair bit better.
Right, same goes for using 5K vs 5.5K MTU. If the MTU is 5K, 3 packets can
conveniently fit into an NTB, but if it is 5.5K, at most two (5.5K)
packets can fit in (essentially filling ~11K of the 16384 bytes and
wasting the rest).
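
To make the packing argument concrete, here is a rough sketch of the arithmetic. It assumes 16-bit NTBs with a single NDP16 (12-byte NTH16, 8-byte NDP16 header, one 4-byte pointer entry per datagram plus a zero terminator entry) and ignores alignment padding, so the real f_ncm numbers may differ slightly. The function name is made up for illustration and is not a kernel API.

```python
# Rough model of how many MTU-sized datagrams fit into one NTB.
# Assumptions (not taken from the thread): NTB16 framing, one NDP16
# per NTB, no alignment padding between datagrams.
ETH_HLEN = 14        # Ethernet header added on top of the L3 MTU
NTB_SIZE = 16384     # NTB size discussed in the thread
NTH16_LEN = 12       # NTB header (16-bit variant)
NDP16_HDR_LEN = 8    # NDP16 header without pointer entries
DPE16_LEN = 4        # one datagram pointer entry (index + length)


def datagrams_per_ntb(mtu, ntb_size=NTB_SIZE):
    """Return how many (mtu + ETH_HLEN)-sized frames fit in one NTB."""
    frame = mtu + ETH_HLEN
    count = 0
    while True:
        n = count + 1
        # n pointer entries plus one zero terminator entry
        needed = NTH16_LEN + NDP16_HDR_LEN + (n + 1) * DPE16_LEN + n * frame
        if needed > ntb_size:
            return count
        count = n


if __name__ == "__main__":
    for mtu in (5120, 5632, 7680, 8000, 8192):
        n = datagrams_per_ntb(mtu)
        print(f"mtu={mtu}: {n} datagram(s), ~{n * mtu} payload bytes per NTB")
```

Under these assumptions 5120 fits three times, 5632 and 7680 fit twice, and 8192 fits only once, which matches the numbers quoted above.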


And whether it's IPv4 or IPv6, like you mentioned in [1], the MTU is what
the NCM layer receives; we prepend the Ethernet header, add the NCM
headers, and send it out after aggregation. Why can't we set
MAX_DATAGRAM_SIZE to ~8050 or ~8100? The reason I suggest this value is
that setting it to 8192 would not use the NTB buffer efficiently. We need
to fill as much of the buffer as possible. Assuming that each packet
received by the ncm layer is of the configured MTU size (not less than
that), then even if only 2 packets are aggregated (the minimum
aggregation possible), we would be filling
(2 * (8050 + ETH_HLEN) + (room for NCM headers)), which is close to the
16384-byte ep max packet size. I have already checked an MTU of 8050 and
it works. We can add a comment in the code detailing the above
explanation and why we chose 8050 or 8100 as MAX_DATAGRAM_SIZE.
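
As a sanity check on the candidate values, the same reasoning can be inverted to ask for the largest MTU that still lets k datagrams share one NTB. This is only a sketch under the same simplified assumptions as before (NTB16 framing, one NDP16, no alignment padding, nothing reserved for slop on the host side); `max_mtu_for` is a hypothetical helper, not something in the patch.

```python
# Largest L3 MTU such that k frames still fit in one NTB.
# Assumptions (illustrative only): NTB16 framing, a single NDP16,
# no alignment padding between datagrams.
ETH_HLEN = 14
NTB_SIZE = 16384
NTH16_LEN = 12
NDP16_HDR_LEN = 8
DPE16_LEN = 4


def max_mtu_for(k, ntb_size=NTB_SIZE):
    """Return the largest MTU for which k datagrams fit in one NTB."""
    # Fixed headers plus k pointer entries and one terminator entry
    overhead = NTH16_LEN + NDP16_HDR_LEN + (k + 1) * DPE16_LEN
    return (ntb_size - overhead) // k - ETH_HLEN


if __name__ == "__main__":
    limit = max_mtu_for(2)
    print("largest MTU that still fits twice:", limit)  # 8162 in this model
    for mtu in (8000, 8050, 8100, 8192):
        print(mtu, "fits twice" if mtu <= limit else "fits only once")
```

In this model 8000, 8050 and 8100 all leave room for two datagrams, and the lower values leave more headroom for overhead we do not control.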

Hope my reasoning for why we can choose 8.1K or 8.05K makes sense. Let me
know your thoughts on this.

Maybe just use an L3 mtu of 8000 then? That's a nice round number...
But I'm also fine with 8050 or 8100.. though 8100 seems 'rounder'.

I'm not sure what the actual overhead is... I guess we control the
overhead in one direction, but not in the other, and there could be
some slop, so we need to be a little generous?


Hi Maciej,

Sure. Let's go with 8000 to leave some space for headers. I would add the following paragraph as a comment so that readers understand why this value was set:

"Although the max mtu as dictated by u_ether is 15412 bytes, setting max_segment_size to 15426 would not be efficient. If the user chooses a segment size > 8192, then we can't aggregate more than one packet in each NTB (assuming each packet coming from the network layer is > 8192 bytes), as the ep maxpacket limit is 16384. So let max_segment_size be limited to 8000 to allow at least 2 packets to be aggregated, reducing wastage of NTB buffer space."

Hope that would be fine.

Regards,
Krishna,