Re: [RFC PATCH] usb: gadget: ncm: Fix handling of zero block length packets

From: Krishna Kurapati PSSNV
Date: Mon Feb 26 2024 - 21:40:39 EST




On 2/27/2024 3:26 AM, Maciej Żenczykowski wrote:
On Mon, Feb 26, 2024 at 3:28 AM Krishna Kurapati
<quic_kriskura@xxxxxxxxxxx> wrote:

While connecting to a Linux host with CDC_NCM_NTB_DEF_SIZE_TX
set to 65536, it has been observed that we receive short packets,
sometimes at intervals of 5-10 seconds, which have block length
zero but still contain 1-2 valid datagrams.

According to the NCM spec:

"If wBlockLength = 0x0000, the block is terminated by a
short packet. In this case, the USB transfer must still
be shorter than dwNtbInMaxSize or dwNtbOutMaxSize. If
exactly dwNtbInMaxSize or dwNtbOutMaxSize bytes are sent,
and the size is a multiple of wMaxPacketSize for the
given pipe, then no ZLP shall be sent.

wBlockLength= 0x0000 must be used with extreme care, because
of the possibility that the host and device may get out of
sync, and because of test issues.

wBlockLength = 0x0000 allows the sender to reduce latency by
starting to send a very large NTB, and then shortening it when
the sender discovers that there’s not sufficient data to justify
sending a large NTB"

However, there is a potential issue with the current implementation,
as it checks for the occurrence of multiple NTBs in a single
giveback by verifying whether the number of leftover bytes to be
processed is zero. If the block length reads zero, we would process
the same NTB infinitely because the leftover byte count never reaches
zero, which leads to a crash. Fix this by bailing out if block length
reads zero.

Fixes: 427694cfaafa ("usb: gadget: ncm: Handle decoding of multiple NTB's in unwrap call")
Signed-off-by: Krishna Kurapati <quic_kriskura@xxxxxxxxxxx>
---

PS: Although this issue was seen after CDC_NCM_NTB_DEF_SIZE_TX
was modified to 64K on the host side, I still believe this
can come up at any time as per the spec. Also, I assumed
that a giveback whose block length is zero has only
one NTB, not multiple ones.

drivers/usb/gadget/function/f_ncm.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
index e2a059cfda2c..355e370e5140 100644
--- a/drivers/usb/gadget/function/f_ncm.c
+++ b/drivers/usb/gadget/function/f_ncm.c
@@ -1337,6 +1337,9 @@ static int ncm_unwrap_ntb(struct gether *port,
 	VDBG(port->func.config->cdev,
 	     "Parsed NTB with %d frames\n", dgram_counter);

+	if (block_len == 0)
+		goto done;
+
 	to_process -= block_len;

 	/*
@@ -1351,6 +1354,7 @@ static int ncm_unwrap_ntb(struct gether *port,
 		goto parse_ntb;
 	}

+done:
 	dev_consume_skb_any(skb);

 	return 0;
--
2.34.1


In general this is of course fine (though see Greg's auto-complaint).

I haven't thought too much about this, but I wonder whether the
check for block_len == 0 shouldn't come right after block_len is
read, i.e. somewhere just after:

block_len = get_ncm(&tmp, opts->block_length);

as it is kind of weird to be handling block_len == 0 at the point where
you are already theoretically done processing the block...

I guess, as is, this assumes the block isn't actually of length 0,
since there's a bunch of following get_ncm() calls...
Are those guaranteed to be valid?


I had the same doubt and tried it. I bailed out as soon as block_len read zero, without processing the datagrams present, and with that change even ping does not work. Everything works only when the datagrams in the zero-block-length NTB are parsed properly.

I guess I don't actually see the infinite loop with block_len == 0,
since get_ncm() always moves us forward...


The loop normally terminates because each pass moves the buffer pointer forward and reduces to_process, and we keep processing the giveback until to_process becomes zero or one. When block length is zero, the buffer pointer never moves forward and to_process is never reduced, so we keep processing the same NTB over and over again.
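
To make the non-termination concrete, here is a minimal user-space model of the parse loop. It is only a sketch: model_unwrap(), its parameters and the iteration cap are hypothetical and not part of f_ncm.c. The only things taken from the driver are the "to_process -= block_len" bookkeeping and the jump back to parse_ntb while bytes remain. The to_process == 1 case is left out, and every NTB in the transfer is assumed to report the same block_len.

```c
#include <assert.h>

/*
 * Toy model of the parse_ntb loop in ncm_unwrap_ntb().
 * Returns the number of passes made, or -1 if 'cap' passes were reached.
 * The cap exists only so that this model terminates; it is an assumption
 * of the sketch, not something the driver does.
 */
static int model_unwrap(int to_process, int block_len, int cap)
{
	int passes = 0;

	assert(cap > 0);                  /* model precondition */

	while (passes < cap) {
		passes++;                 /* one pass == one NTB parsed */
		to_process -= block_len;  /* no progress when block_len == 0 */
		if (to_process > 0)
			continue;         /* stands in for "goto parse_ntb" */
		return passes;
	}
	return -1;                        /* never reached zero: the hang */
}
```

With to_process = 100, a block_len of 50 finishes in two passes, while a block_len of 0 only stops because of the cap.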

Maybe your patch *is* correct as is, and you just need a comment
explaining *why* block_len == 0 is terminal at the spot you're adding the check.

Also couldn't you fix this without goto, by changing

} else if (to_process > 0) {
to
} else if (to_process && block_len) {
// See NCM spec. zero block_len means short packet.
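
As a rough illustration of the goto-free condition suggested here, the continue-or-stop decision can be isolated in a helper. should_parse_next() and count_ntbs() are hypothetical names used only for this sketch; in the driver the condition would sit inline in the existing else-if. The preceding branch of that if is not modelled, and a zero-length NTB is assumed (as in the patch notes) to be the only NTB in its giveback.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper: decide whether another NTB should be parsed from
 * the same transfer. A zero block_len means the NTB was terminated by a
 * short packet, so this sketch assumes nothing follows it.
 */
static bool should_parse_next(int to_process, unsigned int block_len)
{
	return to_process > 0 && block_len != 0;
}

/* Toy loop built on the helper; returns the number of NTBs parsed. */
static int count_ntbs(int to_process, unsigned int block_len)
{
	int parsed = 0;

	assert(to_process > 0);           /* model precondition: non-empty transfer */

	do {
		parsed++;                 /* datagrams of this NTB are parsed first */
		to_process -= (int)block_len;
	} while (should_parse_next(to_process, block_len));

	return parsed;
}
```

The datagrams of the zero-length NTB are still parsed once before the loop stops, which matches the observation that bailing out earlier breaks traffic.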


I will test this once (although, from looking at it, it should definitely work) and send v2 with this change.

Thanks for the review.

Regards,
Krishna