[PATCH 0/3] [RFC] new nmi_watchdog using perf events

From: Don Zickus
Date: Wed Jan 27 2010 - 15:04:17 EST

This patch series tries to implement a new nmi_watchdog using the perf
events subsystem. I am posting this series early for feedback on the
approach. It isn't feature compatible with the old nmi_watchdog yet, nor
does it have all the old workarounds either.

The basic design is to create an in-kernel performance counter that goes off
every few seconds and checks for cpu lockups. It is fairly straight
forward. Some of the quirks are making sure the cpu lockup detection works

It has been lightly tested for now. Once people are ok with the approach,
I'll expand testing to more machines in our lab.

I tried taking a generic approach so all arches could use it if they want
and then implement some per arch specific hooks. I believe this is what
Ingo was suggesting. The initial work is based off of kernel/softlockup.c.

Any feedback would be great.


Sorry if this spams people again. My mailer was broke and some mail wasn't
sent correctly.

Don Zickus (3):
[RFC][x86] move notify_die from nmi.c to traps.c
[RFC] nmi_watchdog: new implementation using perf events
[RFC] nmi_watchdog: config option to enable new nmi_watchdog

arch/x86/kernel/apic/Makefile | 7 ++-
arch/x86/kernel/apic/hw_nmi.c | 114 ++++++++++++++++++++++++
arch/x86/kernel/apic/nmi.c | 7 --
arch/x86/kernel/traps.c | 7 ++
include/linux/nmi.h | 4 +
kernel/Makefile | 1 +
kernel/nmi_watchdog.c | 196 +++++++++++++++++++++++++++++++++++++++++
lib/Kconfig.debug | 13 +++
8 files changed, 341 insertions(+), 8 deletions(-)
create mode 100644 arch/x86/kernel/apic/hw_nmi.c
create mode 100644 kernel/nmi_watchdog.c

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/