performance stuff [Re: Kernel Rebuild & Status...]

David Mosberger (davidm@AZStarNet.com)
Thu, 17 Aug 1995 15:03:45 -0700


Jeff,

You write:

[Deletia about AXPpci33 board]
> One reason could be: L2 cache of 256 KB is too less, should be 1MB !

I don't think it would matter much. The memory bus runs at 33 MHz according
to the specs I have. Seems rather silly for a CPU this fast.

(a) It's the other way round: the bigger the bandwidth discrepancy
between main-memory and CPU, the *more* cache you want. I don't know
the benchmark that was used, but if 256KB vs. 1MB makes the difference
between the benchmark fitting into or exceeding the cache, a
performance difference of an order of magnitude is not unusual. Also,
the benchmark may suffer from conflict-misses (the caches are
direct-mapped). In such a case, it often is sufficient just to
reorganize the data a little to gain significantly more performance.
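
To make the conflict-miss point concrete, here is a minimal sketch of a direct-mapped cache model. The 256KB size comes from the discussion above; the 32-byte line size, the 8-byte elements, and the two-array access pattern are assumptions picked for illustration, not measurements of any real board or benchmark.

```python
def count_misses(addresses, cache_bytes=256 * 1024, line_bytes=32):
    """Count misses in a direct-mapped cache for a sequence of byte addresses."""
    num_lines = cache_bytes // line_bytes
    tags = [None] * num_lines          # one resident block per cache line
    misses = 0
    for addr in addresses:
        block = addr // line_bytes     # which memory block the address is in
        index = block % num_lines      # the only line that block may occupy
        if tags[index] != block:
            tags[index] = block        # evict whatever was there
            misses += 1
    return misses


def interleaved(base_a, base_b, n, elem_bytes=8):
    """Addresses touched by a loop that reads a[i] and then b[i]."""
    for i in range(n):
        yield base_a + i * elem_bytes
        yield base_b + i * elem_bytes


CACHE = 256 * 1024
N = 1024

# b starts exactly one cache size after a: a[i] and b[i] fight for one line.
conflicting = count_misses(interleaved(0, CACHE, N))

# Moving b by a single cache line (the "reorganize the data a little" step).
padded = count_misses(interleaved(0, CACHE + 32, N))

print(conflicting, padded)
```

In this toy setup both arrays together occupy only 16KB, far less than the cache, yet the first layout misses on every access, while the shifted layout misses only once per cache line.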

(b) The 33MHz you mention most likely refers to the PCI bus speed. The
memory-controller of the LCA is independent of the PCI controller
(except for maintaining cache-coherency, which is a nifty feature).
Both the memory signals and the b-cache signals are programmable in
the LCA. The peak-memory bandwidth I have seen quoted for the Noname
(I think it was in the Design Guide) was 80MB/s.
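
As a rough back-of-the-envelope check on what 80MB/s means for the CPU, the arithmetic can be sketched as follows. The 80MB/s figure and the 166MHz clock are the numbers mentioned in this thread; taking 1MB as 10^6 bytes and looking at 8-byte words are assumptions made for the calculation.

```python
def bytes_per_cycle(bandwidth_mb_s, cpu_mhz):
    """Bytes main memory can deliver per CPU clock cycle (1 MB taken as 10**6 bytes)."""
    return bandwidth_mb_s / cpu_mhz


def cycles_per_word(bandwidth_mb_s, cpu_mhz, word_bytes=8):
    """CPU cycles that pass while one word streams in at peak bandwidth."""
    return word_bytes / bytes_per_cycle(bandwidth_mb_s, cpu_mhz)


per_cycle = bytes_per_cycle(80, 166)
per_word = cycles_per_word(80, 166)
print(round(per_cycle, 2), round(per_word, 1))
```

So even at peak rate, memory delivers less than half a byte per cycle, and a 64-bit word costs on the order of 17 cycles of bandwidth. That is why whatever fits in the cache matters so much.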

[Chop]
> After this, I got an Alpha box (Sirius275), with 2MB L2 Cache, 275 Mhz
> and 64 MB memory. And this was, when computing really got fast !
> This box was in the XLispStat test factor 3 faster than the Alpha 166 box.
> (it is significantly more expensive, too ;-))

21064 chip, much faster memory bus. The cache is actually given a chance
to work.

More likely: bigger & faster cache (2MB @ 12ns) and wider memory-bus
(128bit vs. 64bit), i.e., the CPU gets to wait less often for memory
accesses. Of course, the memory-system is what makes those boxes more
pricey (and, I'd venture, for exactly the same reasons there are
low-cost Pentium boxes and not-so-low-cost ones). The fact these days
is just that: the performance of low-end systems is in practice much
more affected by memory- and i/o-performance than by CPU performance
(now what does that say about cheap multiprocessors? :).
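
The "CPU gets to wait less often" argument is the usual average-access-time calculation. A minimal sketch follows. Only the 12ns cache hit time comes from the text above; the miss rates and miss penalties are made-up numbers chosen to show the shape of the effect, not data for either machine.

```python
def average_access_ns(hit_ns, miss_rate, miss_penalty_ns):
    """Average memory access time: every access pays the hit time,
    and the fraction that misses additionally pays the miss penalty."""
    return hit_ns + miss_rate * miss_penalty_ns


# Hypothetical small cache in front of a narrow memory bus.
small_narrow = average_access_ns(hit_ns=12.0, miss_rate=0.10, miss_penalty_ns=200.0)

# Hypothetical bigger cache (fewer misses) and wider bus (cheaper refills).
big_wide = average_access_ns(hit_ns=12.0, miss_rate=0.02, miss_penalty_ns=100.0)

print(small_narrow, big_wide)
```

With these invented numbers the average access time drops from about 32ns to about 14ns without touching the CPU at all.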

One note: when doing performance measurements on Linux/Alpha, please
bear in mind that (as far as I know) neither the kernel nor the
library has seen much performance tuning yet. There are
lots of places that could improve, but, right now, most people are
busy getting the functionality into the system. Compute-bound
benchmarks may not be that sensitive to such factors, but
system-benchmarks certainly are.

And, once again: I still think that Linux/Alpha should also provide a
user-level environment with 32-bit pointers/longs. This can reduce
both memory-consumption and memory-bandwidth requirements. But before
this happens, somebody needs to find the time and actually do it.

I'd bet XLispStat did so badly because all Lisp cells suddenly were
twice as big. Lisp pretty much is a worst case for a system with
64-bit pointers (cf. Jeffrey Mogul's report for more info on this
topic; I think it's available as a TR from DEC WRL).
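
To illustrate the size effect, here is a small sketch that models a Lisp cell as nothing but two pointers (car and cdr). That layout is a simplification assumed for illustration. XLispStat's real node format is not described here and surely carries extra fields. The 256KB figure is the L2 size discussed earlier.

```python
import struct


def cons_cell_bytes(pointer_format):
    """Size of a cell modelled as two packed pointers, with no header or tag word."""
    return struct.calcsize("<" + pointer_format * 2)


CACHE_BYTES = 256 * 1024

cell32 = cons_cell_bytes("I")   # two 4-byte pointers
cell64 = cons_cell_bytes("Q")   # two 8-byte pointers

cells_in_cache_32 = CACHE_BYTES // cell32
cells_in_cache_64 = CACHE_BYTES // cell64

print(cell32, cell64, cells_in_cache_32, cells_in_cache_64)
```

Under this model the same cache holds half as many cells with 64-bit pointers, and walking a list moves twice as many bytes over the memory bus.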

Just my 2 cents.

--david