In my own (biased) opinion, the overclocking isn't the issue here. The
load on the machine when it's crashed has probably been about 0.40
(remember, a dual-processor machine reads 2.00 when both processors are
busy, not 1.00). I've stressed the machine much harder than that (make
-j on the kernel sources, benchmarks, etc.) for CPU, disk, and network
activity.
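To put a number on the load-average point: a rough sketch, assuming the
reported load is simply divided by the processor count. The helper name
per_cpu_load is made up for illustration; os.getloadavg() and
os.cpu_count() are standard Python calls (the former is Unix only).

```python
import os


def per_cpu_load(load_average, cpu_count):
    """Scale a system load average by the number of processors.

    On a dual-processor box a load of 2.00 means both CPUs are busy,
    so dividing by the CPU count gives a per-processor figure.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


if __name__ == "__main__":
    # The figure from this message: load of about 0.40 on two processors.
    print("crash-time load per CPU: %.2f" % per_cpu_load(0.40, 2))

    # The same calculation for the machine this runs on (Unix only).
    one_minute = os.getloadavg()[0]
    cpus = os.cpu_count() or 1  # cpu_count() may return None
    print("current load per CPU:    %.2f" % per_cpu_load(one_minute, cpus))
```

By that measure, 0.40 on two processors works out to about 0.20 per
processor, a long way from saturated.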
The physical machine is sitting in a 6x deg cold room with OEM heatsink
and fan attachments (with a good deal of heatsink compound which I added),
and 2 fans (220V West German metal fans, each used to cool an enclosure
for four 5.25" FH drives and a power supply) pointed directly at them. The
CPU temperature shouldn't hit more than 30 deg C (the lm78 says 24-26, but
it only reports one temperature for 2 processors?!). Lastly, it's a 10%
overclock for the processors, and the bus is documented to work at 75MHz.
SMP is still known to be touchy in 2.0.x, and when you combine NFS, md,
and SCSI, each of which has in the past been known for problems with SMP,
there may still be problems. This is my uneducated, uninformed,
speculative guess.
(Sorry in advance, if this is a bit ranting/raving...)
-Rob H.
On Fri, 14 Nov 1997, Douglas Eadline wrote:
> The first thing I would do is "take your foot off the accelerator"
> and see if your problems go away. (i.e. turn off the overclocking).
> Everything else is just a guess until this variable is eliminated.
>
> Doug Eadline
> -------------------------------------------------------------------
> Paralogic, Inc. | PEAK | Voice:+610.861.6960
> 115 Research Drive | PARALLEL | Fax:+610.861.8247
> Bethlehem, PA 18017 USA | PERFORMANCE | http://www.plogic.com
> -------------------------------------------------------------------
>