Re: Catching NForce2 lockup with NMI watchdog

From: Craig Bradney
Date: Fri Dec 05 2003 - 09:22:21 EST


I'm getting those in dmesg too...

..MP-BIOS bug: 8254 timer not connected to IO-APIC
...trying to set up timer (IRQ0) through the 8259A ... failed.
...trying to set up timer as Virtual Wire IRQ... failed.
...trying to set up timer as ExtINT IRQ... works.


Do you really think this could be the problem?

If so, any ideas why I am relatively lucky to not have the crashes
people are having? 5.5 days, then 5 hours, and now Im up to 17 hours...
with a decent amount of use combined with idle time.

Craig


On Fri, 2003-12-05 at 13:14, Mikael Pettersson wrote:
> Josh McKinney writes:
> > On approximately Fri, Dec 05, 2003 at 08:40:58AM +0100, Mikael Pettersson wrote:
> > > Jesse Allen writes:
> > > > Hi,
> > > >
> > > > I have a NForce2 board and can easily reproduce a lockup with grep on an IDE
> > > > hard disk at UDMA 100. The lockup occurs when both Local APIC + IO-APIC are
> > > > enabled. It was suggested to me to use NMI watchdog to catch it. However, the
> > > > NMI watchdog doesn't seem to work.
> > > >
> > > > When I set the kernel parameter "nmi_watchdog=1" I get this message in
> > > > /var/log/syslog:
> > > > Dec 4 20:10:30 tesore kernel: ..MP-BIOS bug: 8254 timer not connected to
> > > > IO-APIC
> > > > Dec 4 20:10:30 tesore kernel: timer doesn't work through the IO-APIC -
> > > > disabling NMI Watchdog!
> > > >
> > > > "nmi_watchdog=2" seems to work at first, In /var/log/messages:
> > > > Dec 4 20:13:11 tesore kernel: testing NMI watchdog ... OK.
> > > > but it still locks up.
> > >
> > > The NMI watchdog can only handle software lockups, since it relies on
> > > the CPU, and for nmi_watchdog=1 the I/O-APIC + bus, still running.
> > > Hardware lockups result in, well, hardware lockups :-(
> >
> > So does this confirm that the lockups with nforce2 chipsets and apic
> > is actually a hardware problem after all?
>
> Confirm with very high probability. There may be quirks in nVidia's
> chipset that we (unlike their Windoze drivers) don't know about.
>
> Ask nVidia for detailed chipset documentation. Then maybe we can fix this.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/