Re: IPv4 kernel messages

Oliver Xymoron (oxymoron@waste.org)
Wed, 9 Sep 1998 10:41:21 -0500 (CDT)


On 9 Sep 1998, H. Peter Anvin wrote:

> > I looked at this back when the MMX extensions were announced and there
> > seemed to be a way to do it. IIRC, there's an instruction which will
> > extend a register full of n-bit values into 2 registers full of 2n-bit
> > values - sort of like sign extend. This lets you do all your math in 16
> > bit so carry or branch logic is no longer an issue. So you use one MMX
> > register as 4 16-bit accumulators, one as a load target for 8 8-bit
> > values, and two as buffers for zero extending those values. Then
> > processing 8 characters can be (no idea what the opcodes are anymore)
> >
> > load *p to a (1 cycle)
> > extend a to b and c (I think this is two 1 cycle instructions)
> > add b to d (1 cycle)
> > add c to d (1 cycle)
> > increment p (1 cycle)
> >
> > ...which results in four partial results in d which can be merged at the
> > end of the loop into the final checksum.
> >
>
> Well, the issue is that IP checksums are 1's-complement; you actually
> have to add the final carry to the sum. The advantage with this is
> that the sum is byte-order independent.

According to Stevens, I had another detail wrong as well: the checksum is
a 16-bit 1's-complement sum of 16-bit words, not 8-bit ones. This is still
manageable, I think. We can mimic the sums generated by csum_partial by
reading in 64 bits, splitting them across two registers, and accumulating a
64-bit sum. This shouldn't overflow for any packet we're likely to
handle. The carries out of the low 32 bits can be folded back into the
lower half of the sum at the end. Same number of operations as before in
the inner loop.
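
A rough sketch of the arithmetic in plain C (not MMX, and not the kernel's
actual csum_partial), just to show that wide accumulation plus a fold at the
end gives the same 16-bit result as summing word by word. The names
csum_fold64, csum_ref and csum_wide are made up for illustration, and the
sketch assumes the length is a multiple of 8 bytes:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fold a 64-bit accumulator down to 16 bits, adding every carry back
 * into the low half (the end-around carry of 1's-complement addition). */
static uint16_t csum_fold64(uint64_t sum)
{
    sum = (sum & 0xffffffffu) + (sum >> 32);  /* 64 -> at most 33 bits */
    sum = (sum & 0xffffffffu) + (sum >> 32);  /* -> 32 bits */
    sum = (sum & 0xffff) + (sum >> 16);       /* -> at most 17 bits */
    sum = (sum & 0xffff) + (sum >> 16);       /* -> 16 bits */
    return (uint16_t)sum;
}

/* Reference: add 16-bit words one at a time, folding the carry each step.
 * Assumes len is a multiple of 2. */
static uint16_t csum_ref(const unsigned char *p, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 2 <= len; i += 2) {
        uint16_t w;
        memcpy(&w, p + i, sizeof w);          /* native-order 16-bit load */
        sum += w;
        sum = (sum & 0xffff) + (sum >> 16);
    }
    return (uint16_t)sum;
}

/* Wide version: load 64 bits, split into two zero-extended 32-bit halves,
 * and add both into a 64-bit accumulator. No carry handling in the loop.
 * Assumes len is a multiple of 8. */
static uint16_t csum_wide(const unsigned char *p, size_t len)
{
    uint64_t sum = 0;
    for (size_t i = 0; i + 8 <= len; i += 8) {
        uint64_t v;
        memcpy(&v, p + i, sizeof v);          /* native-order 64-bit load */
        sum += v & 0xffffffffu;               /* low half */
        sum += v >> 32;                       /* high half */
    }
    return csum_fold64(sum);
}
```

Each iteration adds less than 2^33 to the accumulator, so it would take on
the order of 2^31 iterations (about 16 GB of data) to overflow 64 bits.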

--
 "Love the dolphins," she advised him. "Write by W.A.S.T.E.." 
