Re: [PATCH] mctp i2c: check packet length before marking flow active

From: Jeremy Kerr

Date: Fri Apr 24 2026 - 00:16:59 EST


Hi William,

> > Out of curiosity though, how did you hit the hdr_byte_count mismatch in
> > the first place?
>
> Our current theory is that we have known buggy firmware on our NVME MCTP
> devices and we are seeing some kind of corruption on the bus that we are
> going to fix on the firmware side.

OK, sounds good for the overall fix, but I don't think that would be
causing the path that you're addressing here. The fix is definitely
valid, but can't be hit through any RX data corruption (we're in the
TX path).

The header byte count is populated during header construction, so a
mismatch here would indicate modification of the skb between that point
and the actual xmit. Do you see the "Bad TX len" warning in these cases?
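
To illustrate the invariant I mean, here is a rough userspace sketch. It
is not the driver code: the sketch_* names and the struct are made up for
this mail, and it assumes the usual MCTP-over-SMBus layout, where the byte
count covers everything after the byte count field (source address plus
payload, no PEC).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 4-byte MCTP-over-I2C header. */
struct sketch_i2c_hdr {
	uint8_t dest_slave;
	uint8_t command;
	uint8_t byte_count;	/* bytes after this field, excluding PEC */
	uint8_t source_slave;
};

/* Bytes that precede the counted region: dest, command, byte_count. */
#define SKETCH_HDR_PREFIX 3

/*
 * Header construction: record the length once, up front.
 * byte_count covers source_slave (1 byte) plus the payload.
 */
static bool sketch_build_hdr(struct sketch_i2c_hdr *hdr, size_t payload_len)
{
	if (payload_len + 1 > UINT8_MAX)
		return false;
	hdr->byte_count = (uint8_t)(payload_len + 1);
	return true;
}

/*
 * Transmit-time sanity check: the total buffer length must still match
 * what the header claimed. A failure here means the buffer changed
 * after the header was built, not that RX data was corrupt.
 */
static bool sketch_tx_len_ok(const struct sketch_i2c_hdr *hdr, size_t total_len)
{
	return total_len == (size_t)hdr->byte_count + SKETCH_HDR_PREFIX;
}
```

With a 10-byte payload, the header records a byte count of 11 and the
only total length that passes the check is 14 (4 header bytes plus
payload). Both sides of that comparison are produced locally on the TX
side, which is why I don't think bus corruption can explain a mismatch.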

> We started also seeing kernel
> crashes along with the bad firmware symptoms, walked through ~110 kdumps
> and found i2c locks that were held by 2 owners (eeprom reading and the
> MCTP TX queue).

Just to clarify my understanding of the state: "being held by two
owners" would indicate a violation of the lock itself. Or is it that
there are two threads blocked waiting to acquire the mutex?

For NVMe-MI, you're likely using manual tag allocation, where the tag
allocation (and hence flow state) is entirely controlled by userspace.
It may be that the NVMe protocol-level errors are causing the tags to
be held for long durations, perhaps?

Cheers,


Jeremy