Re: NMI watchdog + NOHZ question

From: David Miller
Date: Mon Jun 22 2009 - 05:28:04 EST

From: Andi Kleen <andi@xxxxxxxxxxxxxx>
Date: Mon, 22 Jun 2009 10:18:50 +0200

> David Miller <davem@xxxxxxxxxxxxx> writes:
>> Is there something fundamental that should be preventing this?
> Unless that changed recently when I wasn't looking, NOHZ should only
> stop timers when the CPU is idle. So when a driver is doing
> something and interrupts are not disabled for too long, the timers
> should be ticking.
> Then when you're idle, interrupts should never be off, so the NMI
> watchdog cannot fire. On x86 the NMI watchdog is in fact often
> stopped on idle.

Thanks Andi.

I think something else is afoot, because while using "nohz=off" makes
the problem go away, simply adding an NMI watchdog touch after the
schedule() call in cpu_idle() does not.
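
For reference, the experiment described above amounts to roughly the
following. This is only a user-space sketch and not the actual patch:
every function body below is a hypothetical stand-in, the signatures
are simplified, and wait_for_interrupt() is an invented placeholder
for whatever the architecture does to sleep. Only the position of the
touch_nmi_watchdog() call relative to schedule() is taken from the
description.

```c
#include <stdbool.h>

/* Hypothetical user-space stand-ins for kernel state. */
static int watchdog_touches;   /* number of watchdog resets so far */
static bool tick_stopped;      /* models "NOHZ has stopped the tick" */
static int pending_work;       /* nonzero models need_resched() */

/* Stubs named after kernel facilities; bodies are assumptions. */
static void touch_nmi_watchdog(void)           { watchdog_touches++; }
static void tick_nohz_stop_sched_tick(void)    { tick_stopped = true; }
static void tick_nohz_restart_sched_tick(void) { tick_stopped = false; }
static bool need_resched(void)                 { return pending_work > 0; }
static void schedule(void)                     { pending_work = 0; }

/* Invented placeholder: an interrupt arrives and makes work runnable. */
static void wait_for_interrupt(void)           { pending_work++; }

/* One pass through a simplified idle loop. */
static void cpu_idle_once(void)
{
    tick_nohz_stop_sched_tick();      /* tick is only stopped while idle */
    while (!need_resched())
        wait_for_interrupt();
    tick_nohz_restart_sched_tick();
    schedule();
    touch_nmi_watchdog();             /* the experimental addition */
}

/* Run n idle passes from a clean state; return the watchdog touch count. */
int idle_passes(int n)
{
    watchdog_touches = 0;
    pending_work = 0;
    for (int i = 0; i < n; i++)
        cpu_idle_once();
    return watchdog_touches;
}
```

In this model the watchdog is touched once per trip through the idle
loop, so a CPU that keeps leaving idle would never look stuck. That
the real problem persists with such a touch in place is what points
away from the idle path as the cause.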

Also, the CPU on which the NMI watchdog fires is different from the
CPU running the qla2xxx driver init. That basically destroys the bulk
of my theory :-)
