Update: 2.0 kernels, tulip driver, crashes and reboots (long)

Al Youngwerth (alberty@apexxtech.com)
Mon, 11 Jan 1999 16:50:25 -0700


Here are the data points we've gathered over the weekend on our systems
that have been rebooting/locking:

The 9 systems that are running kernel 2.0.31 still have not locked or
rebooted. I'd have to say now that this is statistically significant. These
are a mixture of PNIC and ne2000 systems. Identical systems running 2.0.36
reboot an average of once every 8 unit days. These 9 systems have run over
6 days now (54 unit days without a crash).

We now appear to have a good method for inducing a reboot failure: running
netperf between two PNIC-tulip.89K/2.0.36 systems will cause a reboot in
12-24 hours. Throwing a hdparm -f -t /dev/hda3 into the background on the
same system will induce the failure reboot time to 5-15 minutes on the same
system.

My guess is that the VIA chipset is going into a state that ends up
asserting the processor reset signal. I just setup a digital o-scope to
trigger on the processor reset. After running the test for about 15
minutes, I plugged a USB keyboard into the client netperf system and it
rebooted (this is an interesting datapoint in itself). The digital o-scope
got the trigger on reset. We should have schematics for the motherboard
tommorrow to figure out exactly what is tied into reset to trace it back.
Epox is also contacting VIA about erratas for the chipset and should get
back to us tommorrow.

Other tests we're starting today:

Run netperf/hdparm test on Intel TX-based motherboards with
PNIC-tulip.89K/2.0.36 (running now, no failures after 2 hours)

Run netperf/hdparm test on VPX-based motherboards with ne2000/2.0.36
(running now, no failures after 2+ hours)

Run netperf/hdparm test on VPX-based motherboards with PNIC-tulip.89K/2.0.31

Run netperf test on VPX-based motherboards with Win98/PNIC-driver

I think we're getting close on this. Your ideas and comments on the above
are welcome and appreciated.

Thanks,

Al Youngwerth
alberty@apexxtech.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/