If you get a usable tcpdump log, send it to netdev@nuclecu.unam.mx
>
> I have data to indicate the kernel getting or assembling the headers and
> data segments exactly TWO frames out of sequence. That has explained every
> csum (TCP csum) failure I have captured. I suspect it may be a wild race
> condition -- an interrupt at an odd time (those are kinda hard to localize.)
What do you mean by 'headers' and 'data segments' here? A bug in the IP-level
fragmentation or in the TCP reassembly algorithms? The probability is very
low for both: ICMP messages are never fragmented, so a fragment
reassembly bug can't cause this (in theory a very buggy, non-RFC-compliant
implementation could send fragmented ICMPs, but that's unlikely). TCP
packets are usually not fragmented either, unless you're dealing with a
buggy implementation that doesn't do path MTU discovery properly (possible,
but unlikely). If the TCP segment reassembly has a bug, it will never
cause bad checksum messages, because TCP packets with a wrong checksum
are thrown away early and the reassembly algorithm never sees them. If there
really is a bug there, it could only be caught by additional checksums
at the application level, but I know of no protocol that does this.
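
To illustrate the point about bad segments being dropped before reassembly,
here is a rough Python sketch, not the kernel code. It computes the 16-bit
ones' complement Internet checksum as described in RFC 1071 and filters
segments on it. The names internet_checksum and accept_segments are made up
for this example, and for simplicity the checksum covers only the payload;
the real TCP checksum also covers the TCP header and a pseudo-header.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement Internet checksum (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit big-endian word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def accept_segments(segments):
    """Hypothetical receive path: keep payloads whose checksum matches."""
    return [payload for payload, csum in segments
            if internet_checksum(payload) == csum]


good = b"hello world"
segments = [
    (good, internet_checksum(good)),
    # Corrupted payload carrying the checksum of the original data.
    (b"hellp world", internet_checksum(good)),
]

# Only verified segments ever reach the "reassembly" step.
reassembled = b"".join(accept_segments(segments))
```

The corrupted segment is discarded by the filter, so whatever the reassembly
step does afterwards, it only ever works on data that already passed the
checksum.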
-Andi