Re: Dhrystones (fwd)

Linus Torvalds (Linus.Torvalds@cs.helsinki.fi)
Tue, 22 Aug 1995 00:25:39 +0300


Jim Nance: "Dhrystones (fwd)" (Aug 19, 7:42):
> > My own: 101351 (gcc -O5 -fomit-frame-pointer)
> > 94339 (gcc -O3 -fomit-frame-pointer)
> >
> > Jim's precompiled binaries: (Compiled under OSF1, but benched under Linux)
> >
> > dhry.cc.migrate: 192307 (cc -migrate -O5 -ansi_alias -ifo -non_shared)
> > dhry.gcc.2.6.3: 202702 (gcc -O3 -fomit-frame-pointer)
>
> Did it actually take twice as long to run the binaries compiled under
> Linux? I am wondering if the difference has anything to do with the value
> of HZ or some other timer that may be different between OSF/1 and Linux.

It might be the HZ value (I'm not at my alpha now, so I can't check what
OSF/1 uses), but I suspect it's simply a library issue. I don't think
caches should matter _that_ much for dhrystone (I think it should fit
even in a 256kB cache), but I seem to remember dhrystone being very
sensitive to the performance of such library routines as "strcpy()". So
assuming the OSF/1 library is much more optimized for things like that
(and I'm willing to bet it is), the two-fold increase in performance
isn't totally unlikely.
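
To make the HZ hypothesis concrete, here is a minimal sketch of how a
mismatched tick rate would skew the score. It assumes the benchmark turns
clock ticks into seconds using a compile-time HZ constant. The function name
dhrystones_per_sec is hypothetical, and none of the numbers in the usage
note come from the measurements quoted above.

```c
#include <stdio.h>

/*
 * Score as a tick-based benchmark would compute it (assumed formula):
 *   elapsed seconds = ticks / hz
 *   score           = runs / elapsed seconds = runs * hz / ticks
 * The hz argument stands for whatever value was baked in at build time.
 */
double dhrystones_per_sec(long runs, long ticks, long hz)
{
    return (double)runs * (double)hz / (double)ticks;
}

/* Print the score the binary would report next to the real one. */
void print_skew(long runs, long ticks, long actual_hz, long compiled_hz)
{
    double real = dhrystones_per_sec(runs, ticks, actual_hz);
    double reported = dhrystones_per_sec(runs, ticks, compiled_hz);

    printf("real %.0f, reported %.0f, ratio %.2f\n",
           real, reported, reported / real);
}
```

With illustrative values only, dhrystones_per_sec(1000000, 1000, 100) and
dhrystones_per_sec(1000000, 1000, 200) describe the same run, yet the second
score is exactly twice the first. A binary built with the wrong HZ would
report a score scaled by compiled_hz / actual_hz without running any faster.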

I'm in Santa Cruz right now, and the line is very slow, so I don't want
to try to find any Linux libc.a or dhrystone sources. But if the Linux
libc.a does the straightforward but stupid implementation of strcpy(),
you'll easily see a _huge_ difference to a handcrafted assembly version
that does the copy and zero-compare 8 bytes at a time.

(similar comments hold for strcmp/memcmp etc: strlen and memcpy should
be reasonably optimized already)
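
As a rough illustration of the "copy and zero-compare 8 bytes at a time"
idea, here is a minimal sketch in portable C. It is not the handcrafted
assembly referred to above, and it is not taken from the OSF/1 or Linux
libraries. The names strcpy_words and has_zero_byte are hypothetical. The
sketch assumes that an aligned 8-byte load which runs a few bytes past the
terminating NUL is harmless on the target. ISO C does not guarantee this, so
treat it as a demonstration of the technique rather than a drop-in routine.

```c
#include <stdint.h>
#include <string.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Non-zero if and only if at least one of the 8 bytes in v is zero. */
static uint64_t has_zero_byte(uint64_t v)
{
    return (v - ONES) & ~v & HIGHS;
}

char *strcpy_words(char *dst, const char *src)
{
    char *d = dst;

    /* Copy byte by byte until src sits on an 8-byte boundary. */
    while ((uintptr_t)src % 8 != 0) {
        if ((*d++ = *src++) == '\0')
            return dst;
    }

    /* Copy whole words until one of them contains the terminator. */
    for (;;) {
        uint64_t w;

        /* Aligned load; it may cover bytes after the NUL (assumed safe). */
        memcpy(&w, src, sizeof w);
        if (has_zero_byte(w))
            break;
        memcpy(d, &w, sizeof w);   /* dst may be unaligned, so use memcpy */
        src += 8;
        d += 8;
    }

    /* The last word holds the NUL: finish it one byte at a time. */
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}
```

The main loop does one load, one test, and one store per 8 bytes, where the
straightforward version does a load, a compare, and a store for every single
byte. That is the source of the gap on string-heavy benchmarks.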

Linus