On Fri, Feb 26, 2021 at 01:54:00PM +0100, Hannes Reinecke wrote:
[ .. ]
On 2/26/21 1:35 PM, Daniel Wagner wrote:
On Mon, Feb 15, 2021 at 01:29:45PM -0800, Sagi Grimberg wrote:
Well, I think we should probably figure out why that is happening first.
I got my hands on a tcpdump trace. I've trimmed it to this:
Oh, I am fully aware.
As I suspected, we did receive an invalid frame.
NVM Express Fabrics TCP
Pdu Type: CapsuleResponse (5)
Pdu Specific Flags: 0x00
.... ...0 = PDU Header Digest: Not set
.... ..0. = PDU Data Digest: Not set
.... .0.. = PDU Data Last: Not set
.... 0... = PDU Data Success: Not set
Pdu Header Length: 24
Pdu Data Offset: 0
Packet Length: 24
Unknown Data: 02000400000000001b0000001f000000
0000 00 00 0c 9f f5 a8 b4 96 91 41 16 c0 08 00 45 00 .........A....E.
0010 00 4c 00 00 40 00 40 06 00 00 0a e4 26 af 0a e4 .L..@.@.....&...
0020 c2 1e 11 44 88 4f b8 58 90 ec 8e 1b 32 ed 80 18 ...D.O.X....2...
0030 01 01 fe d3 00 00 01 01 08 0a e6 ed ac be d6 a3 ................
0040 5d 0c 05 00 18 00 18 00 00 00 02 00 04 00 00 00 ]...............
0050 00 00 1b 00 00 00 1f 00 00 00 ..........
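
For reference, the 24 bytes starting at offset 0x42 of the dump can be decoded by hand. The following is only a sketch: it assumes the usual NVMe/TCP common header layout (type, flags, hlen, pdo, plen) followed by a standard 16-byte completion queue entry, which is what the dissector labels "Unknown Data". The variable names are mine, not from any tool.

```python
import struct

# Bytes at offset 0x42 of the dump: the NVMe/TCP PDU that follows the
# Ethernet (14), IPv4 (20) and TCP (32, with options) headers.
pdu = bytes.fromhex(
    "0500180018000000"                  # 8-byte PDU common header
    "02000400000000001b0000001f000000"  # 16-byte completion queue entry
)

# Common header: type, flags, hlen, pdo (one byte each), plen (le32).
pdu_type, flags, hlen, pdo, plen = struct.unpack_from("<BBBBI", pdu, 0)

# Completion queue entry (assumed layout): result (le64), then
# sq_head, sq_id, command_id, status (le16 each).
result, sq_head, sq_id, command_id, status = struct.unpack_from("<QHHHH", pdu, 8)

print(f"type={pdu_type} flags={flags:#04x} hlen={hlen} pdo={pdo} plen={plen}")
print(f"result={result:#x} sq_head={sq_head} sq_id={sq_id} "
      f"command_id={command_id} status={status:#x}")
```

Under those assumptions the header fields match the dissection (type 5, hlen 24, pdo 0, plen 24), and the completion carries command ID 0x1f with status 0. The decode does not show by itself what makes the frame invalid; it only extracts the command ID the host would use to look up the request.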
Data digest would have saved us, but then it's not enabled.
So we do need to check if the request is valid before processing it.
That's just addressing a symptom. You can't fully verify the request is
valid this way because the host could have started the same command ID
the very moment before the code checks it, incorrectly completing an
in-flight command and getting data corruption.
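
To illustrate the window, here is a toy model, not the driver code. TagTable, start() and complete() are hypothetical names made up for this sketch; the only assumption is that the host looks up requests by command ID and rejects completions for IDs that are not currently started.

```python
class TagTable:
    """Toy model of a host's table of in-flight commands, keyed by command ID."""

    def __init__(self):
        self.started = {}  # command_id -> description of the in-flight command

    def start(self, command_id, what):
        # The host issues a command and marks its ID as in flight.
        self.started[command_id] = what

    def complete(self, command_id):
        # The "is this request valid?" check: reject completions for
        # command IDs that are not currently in flight.
        if command_id not in self.started:
            return None
        return self.started.pop(command_id)


host = TagTable()

# Case 1: a bogus completion arrives while command ID 31 is idle.
# The check catches it.
rejected = host.complete(31)

# Case 2: the host starts a new command with ID 31 just before the same
# bogus completion is processed. The check passes, and the new command
# is completed although the controller never answered it.
host.start(31, "new read, data not yet received")
wrongly_completed = host.complete(31)

print(rejected)           # None
print(wrongly_completed)  # new read, data not yet received
```

The second case is the problem: the completion is indistinguishable from a legitimate one, so the check only narrows the window, it does not close it.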