Re: Linux performance on 21066 (UDB)

Thomas Pornin (bip@orion.ens.fr)
Sat, 7 Feb 1998 12:52:45 +0100


In article <199802070400.UAA02489@sun4.apsoft.com> you write:
>When I initally tried Linux on a UDB, I was very disappointed.

This is a complex issue.
First let's look at the processor: 2 instructions per cycle, which is
good, but only if there is no resource conflict. When a register is
written by an instruction, its value cannot be used for a few cycles.
Moreover, the instruction set is rather limited; for instance, there
is no carry flag. Optimisation is a complex task (more so than on a
Pentium, which has some embedded magic) and gcc is not completely
aware of it.
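
To illustrate the carry point: without a carry flag, a multi-word
addition has to recover each carry with an unsigned comparison after
the add. The following is only a minimal sketch in portable C; the
name mp_add and the limb layout (least significant word first) are
assumptions made for the example, not code from any existing library.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: r = a + b, where a, b and r are n-word numbers
 * stored least significant word first. Returns the final carry (0 or 1).
 * There is no add-with-carry here: each carry is recovered by checking
 * whether the unsigned sum wrapped around.
 */
uint64_t mp_add(uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t carry = 0;

    for (size_t i = 0; i < n; i++) {
        uint64_t s  = a[i] + carry;   /* add the incoming carry */
        uint64_t c1 = s < carry;      /* 1 if that addition wrapped */
        uint64_t t  = s + b[i];       /* add the second operand */
        uint64_t c2 = t < s;          /* 1 if that addition wrapped */

        r[i]  = t;
        carry = c1 | c2;              /* both cannot be 1 at once */
    }
    return carry;
}
```

Each word costs two additions and two comparisons, against a single
add-with-carry on a processor that has a carry flag. On the Alpha the
comparisons would typically map to cmpult.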

As for the memory: in a UDB, the cache is too small. Only 256 Kbytes
in the UDB-166 model, and 512 Kbytes in the UDB-233. Considering
that code is bigger on an Alpha than on a Pentium, a cache of 1 or
2 Mbytes would be much better.

And there is the disk problem: the harddisk you find in a UDB is
small, and really slow.

So here is what I did to my UDB-233 model to speed it up:
-- overclocked it to 266 MHz
-- added 16 Mbytes of RAM, so that it now has 40 Mbytes
-- plugged in an external SCSI hard disk (a 3 GB Quantum)
-- replaced gcc with egcs 1.0.1 (it is supposed to produce slightly
better code)
-- compiled a recent 2.1.x kernel (it currently runs 2.1.85) to get
the new dentry code

It now behaves like a Pentium 75 or so, which is not so bad after all.
It makes a very good router, or NFS server (with knfsd, I got
reads and writes at 940 Kbytes/s on 10baseT ethernet).
Occasionally it does MUCH better: in my lab, we develop some
cryptographic code that can encipher about 4 Mbytes per second on
a PPro (with a highly optimized assembly function). I could
achieve 4.7 Mbytes/s on my UDB. It is a matter of 64-bit operations.
Standard applications do not use the 64-bit architecture, and in
fact run on only half of the machine.
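
As a rough illustration of what "using the 64 bits" means, here is a
sketch of the same operation (XORing a key stream into a buffer)
written one byte at a time and one 64-bit word at a time. This is not
the cryptographic code mentioned above; the names xor_bytes and
xor_words64 are invented for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte-oriented version: one 8-bit XOR per loop iteration. */
void xor_bytes(uint8_t *dst, const uint8_t *src, size_t len)
{
    for (size_t i = 0; i < len; i++)
        dst[i] ^= src[i];
}

/* Word-oriented version: one 64-bit XOR per loop iteration. */
void xor_words64(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i = 0;

    /* Process 8 bytes at a time; memcpy keeps unaligned access legal. */
    for (; i + 8 <= len; i += 8) {
        uint64_t d, s;

        memcpy(&d, dst + i, sizeof d);
        memcpy(&s, src + i, sizeof s);
        d ^= s;
        memcpy(dst + i, &d, sizeof d);
    }

    /* Handle the remaining 0 to 7 bytes one at a time. */
    for (; i < len; i++)
        dst[i] ^= src[i];
}
```

Both functions produce the same result; the second one simply handles
eight bytes per operation where the first handles one. Whether a
compiler turns the byte loop into word operations on its own depends
on the compiler and its options.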

To sum up: add memory, change the hard disk, upgrade the kernel.
If you really want computing power, buy a 21164 machine: 4 instructions
per cycle (when there is no conflict), reduced latency (you can reuse a
register after fewer cycles), much better cache (L1 and L2 on-chip, big
L3 on the motherboard), and a higher clock rate. A 21164 at 500 MHz with
decent hardware is much faster than any Intel-based machine in every
situation (even compiles are faster).

And remember that whatever may happen, the UDB has a much better
hacking value than any PC.

--Thomas Pornin