Re: IP Checksumming

Torbjorn Lindgren
Sat, 23 Nov 1996 17:39:35 +0100 (MET)

On Fri, 22 Nov 1996, Richard B. Johnson wrote:
> > My numbers agree with Tom May's give or take 5%. His code craps all
> > over yours both with the data in the cache, in the L2 cache or in
> > main memory (generally it's in L1 cache, but with DMA devices it's in main
> > memory).
> >
> That's what I really enjoy about persons who know everything. They usually
> show great intellectual capacity. Use the Pentium's internal cycle-counter
> to test something before you state "His code craps all over yours...".

*Exactly* what is wrong with testing the good old way, by checksumming a
large data block and measuring the time it takes?! Answer: ** NOTHING **

If one algorithm is 10x slower in that test there is no *REASON* to test
it further, just scrap the slower one or rewrite it from scratch.
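
A minimal sketch of that kind of test, assuming a plain C reference
checksum in the style of RFC 1071 rather than any of the assembly versions
discussed in this thread. The names ip_checksum and time_checksum are
hypothetical, and clock() is only a coarse timer, so the block size and
pass count should be large enough that the elapsed time is well above the
timer resolution:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Reference 16-bit one's-complement checksum (RFC 1071 style).
 * Hypothetical stand-in for whichever routine is being measured. */
uint16_t ip_checksum(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;   /* wide accumulator so large blocks cannot overflow */

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];  /* big-endian 16-bit word */
        data += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)data[0] << 8;  /* odd trailing byte, padded with zero */

    while (sum >> 16)                   /* fold the carries back in */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}

/* Checksum the same block `passes` times and return elapsed CPU seconds.
 * The volatile sink keeps the compiler from discarding the calls. */
double time_checksum(const uint8_t *buf, size_t len, int passes)
{
    volatile uint16_t sink = 0;
    clock_t start, end;
    int i;

    assert(buf != NULL && passes > 0);

    start = clock();
    for (i = 0; i < passes; i++)
        sink = ip_checksum(buf, len);
    end = clock();

    (void)sink;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Swapping another implementation in for ip_checksum and running both over
the same buffer gives the side-by-side comparison described above. As a
sanity check, the 8 bytes 00 01 f2 03 f4 f5 f6 f7 used as the example in
RFC 1071 should come out as 0x220d.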

> As I stated in the beginning of the conversation, that is now getting way
> too tired, test whatever you want with the cycle-counter and the interrupts
> OFF so you test only that code. Then repeat your last statement.

Why? There is no reason at *ALL* to do that.

Anyway, have you followed the comp.arch/comp.lang.c thread about this?
(TCP/IP checksum)

Excerpts from a post by Terje Mathiesen (in the article with
message id: <>, posted 8 Nov):

: The algorithm they use is nearly identical to the one I worked out a
: few years ago, and which has obviously been used several times before,
: on many different platforms.
: The Linux asm version is nearly optimal, wasting one pipeline slot
: every three instructions, so it can still be speeded up by another 33%.
: My only claim to fame here is that I got the idea independently, and
: managed to schedule everything optimally so that I have no wasted
: pipeline slots anywhere on a Pentium.

Torbjörn Lindgren
Funcom Oslo AS, Langkaia 1, N-0150 Oslo, Norway
If Santa ever DID deliver presents on Christmas Eve, he's dead now.