Re: IP Checksumming

Richard B. Johnson (root@analogic.com)
Thu, 21 Nov 1996 20:04:47 -0500 (EST)


On Thu, 21 Nov 1996, Steve VanDevender wrote:

>
> Richard B. Johnson writes:
> > > > Further, the present routines in ../../asm
> > > > don't take advantage of the Intel architecture.
> > >
> > > You mean the lodsw and loop instructions? Those have been losers
> > > since the 386.
> >
> > Not true. These built-in macros are responsible for much of the performance
> > improvements over chips such as the 68k providing the developer took the
> > time to use them.
>
> Have you actually read the timings? On the 486 and up, it's faster to
> do:
>
> move.b %al, (%edi)
> inc.l %edi
>
> than it is to do
>
> stos.b
>
If you have to add an immediate value (like 2 for words), you lose.
The rep movsX instructions are not perfect for everything. Also, for short
copies there is often too much overhead setting up registers ahead of time.

Also, if the 'C' compiler is not consistent in its register use, ASM
routines will incur the overhead of having to save register contents
before using registers for the built-in macros.

However, when it comes to checksumming 1500 bytes, summed as 16-bit WORDs
for the funny one's-complement checksum used by TCP/IP, I believe these
macros are faster.
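
For reference, the checksum in question can be sketched in portable C as
below. This is only a minimal illustration in the style of RFC 1071, not the
kernel's routine and not the string-instruction version argued for here; the
name inet_checksum is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: one's-complement checksum over 16-bit words.
 * Bytes are paired in network (big-endian) order; the result is returned
 * as a host-order integer. */
uint16_t inet_checksum(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint64_t sum = 0;            /* wide accumulator, carries folded later */

    while (len > 1) {
        sum += ((uint32_t)p[0] << 8) | p[1];
        p += 2;
        len -= 2;
    }
    if (len == 1)
        sum += (uint32_t)p[0] << 8;   /* odd trailing byte, zero padded */

    /* Fold the carries back into the low 16 bits (end-around carry). */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

Running it over a buffer that already has its checksum appended should
give 0, which is the usual receive-side check.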

The Pentium has a built-in cycle counter that can be used to prove this.
Opcode 0x0f, 0x31 (RDTSC) loads edx:eax with the current cycle count. You
can time your own routines without worrying about clock speeds, etc.
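
A rough sketch of reading that counter from C follows. It assumes a GCC or
Clang style compiler with extended inline asm; the name read_cycles and the
clock() fallback for non-x86 builds are inventions for this example.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: return the CPU cycle counter as a 64-bit value. */
static inline uint64_t read_cycles(void)
{
#if defined(__i386__) || defined(__x86_64__)
    uint32_t lo, hi;
    /* RDTSC (opcode 0x0f, 0x31): low half in eax, high half in edx. */
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
#else
    /* Assumption: no cycle counter here, so fall back to clock() ticks. */
    return (uint64_t)clock();
#endif
}
```

Take one reading before the routine and one after; the difference is the
elapsed count. Repeating the measurement and keeping the smallest value
helps filter out interrupts and cache misses.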

Cheers,
Dick Johnson
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Richard B. Johnson
Project Engineer
Analogic Corporation
Voice : (508) 977-3000 ext. 3754
Fax : (508) 532-6097
Modem : (508) 977-6870
Ftp : ftp@boneserver.analogic.com
Email : rjohnson@analogic.com, johnson@analogic.com
Penguin : Linux version 2.1.11 on an i586 machine.
Warning : It's hard to remain at the trailing edge of technology.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-