RE: NForce2 pseudoscience stability testing (2.6.0-test11)

From: Craig Bradney
Date: Thu Dec 04 2003 - 10:21:30 EST

As reported earlier today, I had my first lockup this morning after more
than 5 days of uptime. Having had that, I decided to go for the latest Gentoo 2.6
test 11-r1 kernel. This means I was now running the following patches:
instead of
on top of vanilla test 11.

Anyway.. my two changes on rebuilding the kernel were initially:
-Add in preempt (because there were questions asked here)
-Remove generic IDE support (I know I have Nvidia IDE so let's only have
that one).

In that configuration the "multiple hdparm -t /dev/hda" test hung my

Rebuilt kernel without preempt.. no hang on hdparm test.
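
For anyone wanting to repeat the "multiple hdparm -t" test, here is a
minimal sketch of one way to run several copies at once. The helper name
run_parallel and the count of 4 in the usage comment are assumptions for
illustration, not the exact commands used in this report.

```shell
#!/bin/sh
# Sketch: start N concurrent copies of a command, then wait for all of them.
# run_parallel is a hypothetical helper name, not part of hdparm.
run_parallel() {
    count=$1
    shift
    pids=""
    i=0
    while [ "$i" -lt "$count" ]; do
        "$@" &                  # launch one copy in the background
        pids="$pids $!"         # remember its PID
        i=$((i + 1))
    done
    status=0
    for pid in $pids; do
        wait "$pid" || status=1 # report failure if any copy failed
    done
    return "$status"
}

# Usage (needs root; the count of 4 is an assumption):
#   run_parallel 4 hdparm -t /dev/hda
```

If the box hard-hangs, the script never returns, so keep an eye on the
console rather than waiting for an exit status.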

So in summary, apart from the patch changes above, the only difference
from my 5-day kernel is that I now don't have generic IDE support.

So now, I'll leave it on and see how far it goes.. 13 mins so far :).


On Thu, 2003-12-04 at 13:17, b@xxxxxxxxxxxxx wrote:
> Dan Creswell wrote:
> Thanks for the input, I'll pass it to the list.
> Most of these Athlon victims are UP users, in fact, I
> believe they are exclusively UP. Does MPS 1.1/1.4 ever play a role
> in a UP system? I don't think the NForce2 chipset,
> where we are seeing these hard hangs (no ping, no screen,
> no blinking cursor, no toggling the caps lock, nothing), is
> capable of SMP operation.
> Now what's interesting is that you finger the IDE as a potential
> culprit and think it's very low level. Interesting.
> By the way, I've had trouble with SMP on a Tyan board with an
> i840 chipset with Linux before - I was never able to resolve
> the issue and had to return the board.
> I've beaten on an Intel SR1300 and SR2300 dual Xeon (aka
> Micron's Netframe 1610/2610 aka Sun 60x / 65x) and never run
> into these hangs with kernels up to 2.4.22. The motherboard
> is an Intel SE7501WV2.
> >Hi,
> >Been following this thread silently for a while and thought I'd drop
> >you a line as I have some other data you may find useful.
> >
> >My machine is a dual Xeon with 2Gb, E1000 NIC, MPT LSI SCSI
> >disks and an
> >
> >2.6-test9 is only stable on this machine with noapic passed in the
> >kernel parameters - otherwise, it locks up in no time flat. I can
> >also run this kernel in single-processor mode with the APIC enabled
> >and that's stable.
> >
> >2.4.23-rc2 runs fine on the same machine with the APIC enabled
> >in SMP mode.
> >
> >2.4.23-rc5 locks up on this machine if I use the same .config as for
> >-rc2. However, if I disable ACPI and pass "pci=noacpi" to the kernel,
> >this too runs fine.
> > - Seems like the ACPI changes in -rc3 are a problem for my machine.
> >
> >All of these behaviors have been observed with MPS 1.4 (I've changed
> >that BIOS setting to 1.1 today in preparation for more testing of the
> >above to see if that makes a difference).
> >
> >I mention all of this because none of my lock-ups have happened whilst
> >accessing the IDE subsystem. I *have* had lockups with simultaneous
> >network and disk access and I've also seen it with simply
> >mouse-waggling. I suspect that the problem is *very* low-level and
> >likely related to the interrupt load. In my case, the problems only
> >seem to occur with SMP configurations which makes me suspect there may
> >be a locking/simultaneous update problem.
> >
> >Oh, forgot to say, my motherboard is a Tyan Thunder S2665 (based on
> >the Intel E7505 chipset).
> >
> >Hope that helps,
> >
> >Dan.
