Re: [PATCH v3] usb: gadget: ncm: Avoid dropping datagrams of properly parsed NTBs
From: Junzhong Pan
Date: Sat Jan 11 2025 - 05:32:06 EST
Hi Maciej
Thanks for the reply, I am sorry for the unclear description.
+remove hgajjar@xxxxxxxxxxxxxx from CC list due to mail delivery failure.
On 2025/1/11 3:00, Maciej Żenczykowski wrote:
> On Thu, Jan 9, 2025 at 11:37 PM Junzhong Pan <panjunzhong@xxxxxxxxxxx> wrote:
>> Lecroy shows the wSequence=5765 have 10 Datagram consisting a 31*512
>> bytes=15872 bytes OUT Transfer but have no ZLP:
>>
>> OUT Transfer wSequence=5765
>> NTH32 Datagrams: 1514B * 8 + 580B NDP32
>> Transfer length: 512B * 31
>> NO ZLP
>> OUT Transfer wSequence=5766
>> NTH32 Datagrams: 1514B * 8 NDP32
>> Transfer length: 512B * 29 + 432
>>
>> This lead to a result that the first givebacked 16K skb correponding to
>> a usb_request contains two NTH but not complete:
>>
>> USB Request 1 SKB 16384B
>> (NTH32) (Datagrams) (NDP32) | (NTH32) (Datagrams piece of wSequence=5766)
>> USB Request 2 SKB 14768B
>> (Datagrams piece of wSequence=5766) (NDP32)
>>
>> From the context, it seems the first report of Wrong NDP SIGN is caused
>> by out-of-bound accessing, the second report of Wrong NTH SIGN is caused
>> by a wrong beginning of NTB parsing.
>
> Could you clarify which Linux Kernel you're testing against?
I am using linux 6.6.63, but the related codes have not newer updates since.
> Could you provide some pcap of the actual usb frames?
> Or perhaps describe better the problem, because I'm not quite
> following from your email.
> (I'm not sure if the problem is what windows is sending, or how Linux
> is parsing it)
I think the root cause of the problem is because Windows 10 ncm class
driver doesn't send ZLP, meanwhile the current parsing is depend on a condition
that NTB won't spread across usb_request.
This is observed only on Windows 10 but not on Windows 11.
To reproduce the issue, you just need a Windows 10 machine and run iperf3 -c
from the windows and iperf3 -s on the gadget board, configure the gadget
with the following os_desc, windows10 will bind its ncm driver automatically:
echo 0xEF > $GADGET/bDeviceClass
echo 0x02 > $GADGET/bDeviceSubClass
echo 0x01 > $GADGET/bDeviceProtocol
echo 1 > $GADGET/os_desc/use
echo 0x1 > $GADGET/os_desc/b_vendor_code
echo "MSFT100" > $GADGET/os_desc/qw_sign
mkdir -p $FUNCTIONS/ncm.0/os_desc/interface.ncm
echo WINNCM > $FUNCTIONS/ncm.0/os_desc/interface.ncm/compatible_id
Reproduced on dwc2 and dwc3 controller with linux 6.6.63.
> I *think* what you're saying is that wSequence=5765 & 5766 are being
> treated as a single ncm message due to their being a multiple of 512
> in the former, not followed by a ZLP? I thought that was precisely
> when microsoft ncm added an extra zero byte...
>
> What's at the end of 5755? Padding? No padding?
> Is there an NTH32 header in 5766? Should there be?
>
Okay, I will explain it more precisely (some numbers are corrected).
On the USB wire, the transfers on OUT endpoint is like this:
OUT Transfer #1 issued by win10, Transfer length: 512B * 31 = 15872 Bytes
OUT Transaction 512B for 31 times, no ZLP and other things. Parsed as below:
[Offset 0x0000] A NTH32 Header with wSequence=5765, dwNdpIndex=0x3D90
[Offset 0x0012] Datagram blocks
[Offset 0x3D90] A NDP32 describing 11 Datagram blocks plus one zero length item and padding, len = 112 Bytes.
(0)index=0x0012 length=1514
(1)index=0x05FE length=1514
(2)index=0x0BEA length=1514
(3)index=0x11D6 length=1514
(4)index=0x17C2 length=1514
(5)index=0x1DAE length=1514
(6)index=0x239A length=1514
(7)index=0x2986 length=1514
(8)index=...... length=1514
(9)index=...... length=1514
(10)index=..... length=580
(11)index=0x0 length=0
...padding...
OUT Transfer #2 issued by win10, Transfer length: 512B * 29 + 432 = 15280 Bytes
OUT Transaction 512B for 29 times, and OUT Transaction 432B for one time.
[Offset 0x0000] A NTH32 with wSequence=5766, dwNdpIndex=0x3B48
[Offset 0x0012] Datagram block
[Offset 0x3B48] A NDP32 describing Datagram blocks
In the u_ether driver, we first queued a usb_request(buf=skb.data, length=16384)
to the udc driver, the udc should give it back when it receive a 16384B data or
encounter ZLP/Short packet.
Since the OUT Xfer #1 doesn't have ZLP or SP, the udc won't giveback the usb_request
with req->actual = 15872, instead, it gather some piece of the data from OUT Transfer #2
to make a req->actual=16384 then return to rx_complete.
For f_ncm, we now have a the first usb_request givebacked from udc driver having
skb->data organized as the following structure:
usb_request #1
-----------------------from OUT Transfer #1:-------------------
[Offset 0x0000] A NTH32 Header with wSequence=5765, dwNdpIndex=0x3D90
[Offset 0x0012] Datagram blocks
[Offset 0x3D90] A NDP32 describing 10 Datagram blocks plus one zero length item and padding, len = 112 Bytes.
(0)index=0x0012 length=1514
(1)index=0x05FE length=1514
(2)index=0x0BEA length=1514
(3)index=0x11D6 length=1514
(4)index=0x17C2 length=1514
(5)index=0x1DAE length=1514
(6)index=0x239A length=1514
(7)index=0x2986 length=1514
(8)index=...... length=1514
(9)index=...... length=1514
(10)index=..... length=580
(11)index=0x0 length=0
...padding...
--------------------from OUT Transfer #2:-------------------
[Offset 0x3E00/15872] A NTH32 Header with wSequence=5766, dwNdpIndex=0x3B48
[Offset 0x3E00+???] Datagram block piece from next NDP (wSequence=5766)
During the unwrap, we will first report a "Wrong NTH SIGN" when we try to redirect to
the NDP of wSequence=5765 parsing the second NTB in the usb_request, since theie is no
bound checking, we read some garbage data from skb->data + [0x3E00 + 0x3B48].
Raising the following message (without modification):
[ 174] configfs-gadget.0: Wrong NDP SIGN
After this, we go to the next usb_request, at this time, we have a short package(432B) from
the OUT Transfer #2, thus the udc would giveback next usb_request to us like this:
usb_request #2
--------------------from OUT Transfer #2:-------------------
[Offset 0x0000] Datagram block piece from NTB of wSequence=5765
3f 98 a6 8e 17 f8 bb 29 07 b8 da 13 7f 20 80 8e 77 ca 32 07 ac 71 b8 8d 84 03 d7 1b 96 9b c4 fa
[Offset 0x????] A NDP32 describing Datagram blocks
At this point, the unwrap function try to read a NTH sign but got unexpected user data,
then it raise this message:
[ 174] configfs-gadget.0: Wrong NTH SIGN, skblen 14768
[ 174] HEAD:00000000b1a72bfc: 3f 98 a6 8e 17 f8 bb 29 07 b8 da 13 7f 20 80 8e 77 ca 32 07 ac 71 b8 8d 84 03 d7 1b 96 9b c4 fa
In short, there is two NTBs across two usb_requests, the beginning of a usb_request is
not necessary a NTH sign.
Do this make sense to you?
Best Regards,
Pan