Just to give some concrete numbers: A fully unrolled (up to 2048 bytes)
csum_partial() does about 5% better than the current stock 686 code for
1480-byte lengths. The size of the routine increases from (iirc - i
measured this yesterday) 23x bytes to 23xx bytes...
[note that for short buffers, only a small part of the routine will
be executed though - so the icache footprint may not be much
bigger for these cases. still...]
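
For anyone who wants to play with the trade-off outside the kernel, here
is a rough portable C sketch of what "unrolling" means for this kind of
sum. It is not the kernel's csum_partial(); the names csum_simple(),
csum_unrolled(), fold16() and be16() are made up for illustration, the
4x unroll factor is arbitrary, and the words are read big-endian only to
keep the result independent of the host. Both functions return the
folded 16-bit sum without the final complement.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator into 16 bits with end-around carry. */
static uint16_t fold16(uint64_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/* Read one 16-bit word in big-endian order (assumption of this sketch). */
static uint32_t be16(const uint8_t *p)
{
    return ((uint32_t)p[0] << 8) | p[1];
}

/* Reference version: one 16-bit word per loop iteration. */
uint16_t csum_simple(const uint8_t *p, size_t len)
{
    uint64_t sum = 0;

    for (; len >= 2; p += 2, len -= 2)
        sum += be16(p);
    if (len)                          /* odd trailing byte, zero padded */
        sum += (uint32_t)p[0] << 8;
    return fold16(sum);
}

/* Unrolled version: four words per iteration, then a short tail loop. */
uint16_t csum_unrolled(const uint8_t *p, size_t len)
{
    uint64_t sum = 0;

    for (; len >= 8; p += 8, len -= 8) {
        sum += be16(p);
        sum += be16(p + 2);
        sum += be16(p + 4);
        sum += be16(p + 6);
    }
    for (; len >= 2; p += 2, len -= 2)
        sum += be16(p);
    if (len)                          /* odd trailing byte, zero padded */
        sum += (uint32_t)p[0] << 8;
    return fold16(sum);
}
```

The carries are collected in the wide accumulator and folded only once at
the end, so the loop body is just loads and adds. Unrolling further
removes more loop overhead but grows the code, which is the size
trade-off described above.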
> doing fast MMX TCP checksums is possible, even if the MMX engine doesn't
> have a carry logic, this is from a csum routine i wrote a year ago:
> demonstrates the method nicely), but i finally found that the FPU handling
> complexity is simply not worth it. More and more networking cards are
> doing IP checksumming anyway.
MMX is probably not worth it for the checksum alone, but for
the checksum&copy case it could, maybe, be a win. Is your MMX code
available somewhere?
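
To be clear about what i mean by the checksum&copy case: the data is
summed in the same pass that copies it, so each byte is read only once.
A minimal portable C sketch, assuming big-endian 16-bit words and a
hypothetical function name csum_and_copy_sketch() (this is not the
kernel's routine, and it returns the folded sum without the final
complement):

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Copy len bytes from src to dst and return the 16-bit ones'-complement
 * sum of the data, computed in the same pass as the copy.
 * Hypothetical helper for illustration only.
 */
uint16_t csum_and_copy_sketch(uint8_t *dst, const uint8_t *src, size_t len)
{
    uint64_t sum = 0;
    size_t i = 0;

    for (; i + 1 < len; i += 2) {
        uint8_t hi = src[i];
        uint8_t lo = src[i + 1];

        dst[i] = hi;                  /* copy ... */
        dst[i + 1] = lo;
        sum += ((uint32_t)hi << 8) | lo;   /* ... and sum the same bytes */
    }
    if (i < len) {                    /* odd trailing byte, zero padded */
        dst[i] = src[i];
        sum += (uint32_t)src[i] << 8;
    }
    while (sum >> 16)                 /* fold carries back into 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}
```

Whether wider (MMX) loads and stores would actually pay for the FPU state
handling here is exactly the open question.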
artur