Re: tickle nmi watchdog whilst doing serial writes.

From: Ed Tomlinson
Date: Sat May 14 2005 - 05:33:33 EST


Hi,

Suspect this can be triggered using alt+sysreq+T on a busy system with a slow
serial console. Might be a easy way to see if this patch fixes the issue?

Ed Tomlinson

On Saturday 14 May 2005 03:07, Andrew Morton wrote:
> Dave Jones <davej@xxxxxxxxxx> wrote:
> >
> > On Fri, May 13, 2005 at 11:43:31PM -0700, Andrew Morton wrote:
> > > Dave Jones <davej@xxxxxxxxxx> wrote:
> > > >
> > > > This was fun. I inserted a music CD with some obnoxious copy-protection
> > > > on it into the drive, and lots of SCSI errors went zipping over to
> > > > the serial console. Unfortunatly, the box was also compiling a kernel,
> > > > playing oggs, and doing a number of other things at the same time,
> > > > so this happened..
> > > >
> > > > NMI Watchdog detected LOCKUP on CPU2CPU 2
> > >
> > > OK.. But calling touch_nmi_watchdog() at 1MHz seems a bit excessive, and
> > > might perturb the finely-tuned timing in there.
> > >
> > > How's about this?
> >
> > Umm.. Despite it being past my bedtime, I'm pretty sure I'm
> > missing something here...
> >
> > > + while (!(serial_in(up, UART_MSR) & UART_MSR_CTS) && --tmout)
> > > udelay(1);
> >
> > I don't see how this is any better than the current code.
> > We're doing 1000000 udelays. Whilst we're doing that,
> > the nmi watchdog goes bonkers.
> >
> > > + if (tmout < 1000000)
> > > + touch_nmi_watchdog();
> >
> > So by the time we do this, its already triggered.
>
> But the NMI watchdog won't expire after one second - normally it's set to
> fixe seconds.
>
> > How about..
> >
> > --- linux-2.6.11/drivers/serial/8250.c~ 2005-05-14 02:49:02.000000000 -0400
> > +++ linux-2.6.11/drivers/serial/8250.c 2005-05-14 02:54:30.000000000 -0400
> > @@ -40,6 +40,7 @@
> > #include <linux/serial_core.h>
> > #include <linux/serial.h>
> > #include <linux/serial_8250.h>
> > +#include <linux/nmi.h>
> >
> > #include <asm/io.h>
> > #include <asm/irq.h>
> > @@ -2099,8 +2100,15 @@ static inline void wait_for_xmitr(struct
> > if (up->port.flags & UPF_CONS_FLOW) {
> > tmout = 1000000;
> > while (--tmout &&
> > - ((serial_in(up, UART_MSR) & UART_MSR_CTS) == 0))
> > + ((serial_in(up, UART_MSR) & UART_MSR_CTS) == 0)) {
> > + int cnt=0;
> > udelay(1);
> > + cnt++;
> > + if (cnt==100) {
> > + touch_nmi_watchdog();
> > + cnt=0;
> > + }
> > + }
>
> <obwhitespacewhine> spose so.
>
> --- 25/drivers/serial/8250.c~tickle-nmi-watchdog-whilst-doing-serial-writes 2005-05-14 00:03:09.000000000 -0700
> +++ 25-akpm/drivers/serial/8250.c 2005-05-14 00:06:53.000000000 -0700
> @@ -40,6 +40,7 @@
> #include <linux/serial_core.h>
> #include <linux/serial.h>
> #include <linux/serial_8250.h>
> +#include <linux/nmi.h>
>
> #include <asm/io.h>
> #include <asm/irq.h>
> @@ -2098,9 +2099,11 @@ static inline void wait_for_xmitr(struct
> /* Wait up to 1s for flow control if necessary */
> if (up->port.flags & UPF_CONS_FLOW) {
> tmout = 1000000;
> - while (--tmout &&
> - ((serial_in(up, UART_MSR) & UART_MSR_CTS) == 0))
> + while (!(serial_in(up, UART_MSR) & UART_MSR_CTS) && --tmout) {
> udelay(1);
> + if ((tmout % 1000) == 0)
> + touch_nmi_watchdog();
> + }
> }
> }
>
> _
>
> -
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/