Re: [PATCH 3/3] [RFC] nmi_watchdog: config option to enable new nmi_watchdog

From: Ingo Molnar
Date: Tue Feb 02 2010 - 02:29:28 EST



* Don Zickus <dzickus@xxxxxxxxxx> wrote:

> On Fri, Jan 29, 2010 at 09:12:27AM +0100, Ingo Molnar wrote:
> >
> > * Don Zickus <dzickus@xxxxxxxxxx> wrote:
> >
> > > On Thu, Jan 28, 2010 at 03:54:54PM +0100, Peter Zijlstra wrote:
> > > > On Wed, 2010-01-27 at 15:03 -0500, Don Zickus wrote:
> > > > >
> > > > > These are the bits that enable the new nmi_watchdog and safely
> > > > > isolate the old nmi_watchdog. Only one or the other can run, not
> > > > > both at the same time.
> > > >
> > > > perf disables the lapic watchdog when it wants the pmu, so there
> > > > shouldn't be a problem having both built in.
> > >
> > > Yes, it does disable it, but that does not prevent nmi_watchdog_tick
> > > from running nor the /proc interface from being loaded. So perhaps my description
> > > isn't very good. The idea with the new watchdog was to re-use some of
> > > the bits of the old one, but having them both compiled in seemed to
> > > stomp on each other. That is what I was trying to prevent.
> > >
> > > I can certainly change the behaviour, just makes the code a little more
> > > messy I think.
> >
> > I think that's a good idea - and i think we want to be bold and just have
> > the new code run seamlessly. (and fix bugs, if any.)
>
> Ok. I guess I am confused about what you are suggesting here: to do as
> Peter suggested and run both at the same time?

I don't think we want to run old and new code at once. The old NMI watchdog
code is really a hardcoded minimal PMU driver generating a cycles-based NMI
tick once per second.
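
To illustrate the "cycles-based tick once per second" point: a cycle counter
overflows once per second when its period equals the CPU clock rate in Hz.
The sketch below is only an illustration of that arithmetic, not code from
the patch; the helper name nmi_sample_period() and its parameters are
hypothetical, and it assumes a constant clock rate given in kHz.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the patch: number of CPU cycles a
 * cycles counter must count between overflows so that it raises one NMI
 * every `seconds` seconds. `cpu_khz` is the CPU clock rate in kHz and is
 * assumed to be constant.
 */
static uint64_t nmi_sample_period(uint64_t cpu_khz, uint64_t seconds)
{
	/* kHz * 1000 = cycles per second; scale by the desired interval. */
	return cpu_khz * 1000ULL * seconds;
}
```

On a 2 GHz CPU (cpu_khz = 2000000) this gives a period of 2,000,000,000
cycles for a one-second tick.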

> > What do you think?
>
> I will need to give you an updated patch that properly sets the frequency
> of the NMI and I probably should still implement a code path that uses the
> software perf counters in the cases where the hardware perf counters are
> not available.
>
> It seems like you are ok with my approach. If that is so, I can test on
> more machines to iron out some more bugs. Or did you want to take my
> patches as is and have me throw fixes on top?

Well, all known bugs/showstoppers should be fixed - but otherwise if you
think it works fine we can certainly apply it and then iterate it from that
point on to increase coverage and add features.

Ingo