Subtle bit errors are generally very random and very hard to catch. On
machines without parity they also normally result in wrong answers rather
than noticed crashes.
> Just for kicks, I did run some more tests on NT and it didn't fail once.
> Does it have a mechanism for catching these hardware flaws and doing a
> retry or some such?
No, but it doesn't hit the machine with the same patterns as Linux. A standard
PC has no useful mechanism for recovery from errors. Parity RAM will trap
some memory errors (but not cache coherency or bus errors) and give an NMI.
You pay a serious premium for PCs with proper ECC RAM and the like.
> I still would like to work through some of the other things outlined in
> the Sig11 page (ie. disabling cache, etc), but as this is my main machine,
> I can't be out of service for testing for too long.
Understood.