Re: [PATCH v3 1/3] net/smc: bound the wire-controlled producer cursor to the RMB
From: Bryam Vargas
Date: Thu Jun 18 2026 - 18:11:54 EST
On Thu, 18 Jun 2026 22:29:20 +0800, Dust Li wrote:
> once we detect that the peer is misbehaving, I think the right action is
> to abort the connection and record the event, rather than silently clamp.
[...]
> u32 prod_count = ntohs(cdc->prod.count);
> ...
> cdc->prod.wrap > 1 || cdc->cons.wrap > 1) {
Thanks for taking a look, Dust. I'm on board with the direction for net-next --
aborting and recording a bad CDC is cleaner than clamping something we already know
we can't trust, and as you say, the clamp just papers over the peer bug. So: minimal
clamp stays for -stable, and net-next gets the wire-boundary check + abort (through
abort_work, with an smc_stats counter and a ratelimited warn).
A few things I ran into on the check itself, though:
- count is __be32, so it wants ntohl() rather than ntohs() -- ntohs() ends up reading
the wrong half.
- I'd drop the wrap > 1 tests. wrap is a free-running counter (smc_curs_add does
wrap++), so a connection that legitimately wraps its RMB ends up with wrap > 1; and
since it's a __be16 read raw, on little-endian wrap==1 already reads as 0x0100 and
we'd abort on the very first wrap. I don't think there's a sane upper bound to put
on wrap.
- the check is typed for SMC-R, but the SMC-D path hands a host-order smcd_cdc_msg to
smc_cdc_msg_recv() cast as smc_cdc_msg (smc_cdc.c:456), so ntohl/ntohs would
double-swap it there. The simplest thing I found is one check on the host cursor
right after smc_cdc_msg_to_host(), before the diff/atomic_add block -- that covers
SMC-R and SMC-D in one place.
Minor: >= len rather than > len (count is an offset in [0,len)), and peer_rmbe_size
is signed so worth guarding. The cons vs peer_rmbe_size bound looks right to me.
Happy to spin it whichever way you prefer.
Bryam