Re: 2.6.38-rc2: Uhhuh. NMI received for unknown reason 2d on CPU 0.
From: Dave Airlie
Date: Wed Feb 16 2011 - 21:56:10 EST
>
> It's appended below for your convenience. Are you using this
> unsuccessfully?
This patch quoted below fixes it for me.
No more spurious NMIs on my P4.
Tested-by: Dave Airlie <airlied@xxxxxxxxxx>
>
>
> From: Cyrill Gorcunov <gorcunov@xxxxxxxxxx>
> Subject: [PATCH] perf, x86: P4 PMU -- Fix unflagged overflows test
>
> A couple of people have reported an unknown NMI issue on p4 pmu.
> This patch should fix it.
>
> Reported-by: George Spelvin <linux@xxxxxxxxxxx>
> Reported-by: Meelis Roos <mroos@xxxxxxxx>
> Reported-by: Don Zickus <dzickus@xxxxxxxxxx>
> Signed-off-by: Cyrill Gorcunov <gorcunov@xxxxxxxxxx>
> CC: Ingo Molnar <mingo@xxxxxxx>
> CC: Lin Ming <ming.m.lin@xxxxxxxxx>
> CC: Don Zickus <dzickus@xxxxxxxxxx>
> CC: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> ---
> arch/x86/include/asm/perf_event_p4.h | 1 +
> arch/x86/kernel/cpu/perf_event_p4.c | 11 ++++++++---
> 2 files changed, 9 insertions(+), 3 deletions(-)
>
> Index: linux-2.6.tip/arch/x86/include/asm/perf_event_p4.h
> ===================================================================
> --- linux-2.6.tip.orig/arch/x86/include/asm/perf_event_p4.h
> +++ linux-2.6.tip/arch/x86/include/asm/perf_event_p4.h
> @@ -22,6 +22,7 @@
>
> #define ARCH_P4_CNTRVAL_BITS (40)
> #define ARCH_P4_CNTRVAL_MASK ((1ULL << ARCH_P4_CNTRVAL_BITS) - 1)
> +#define ARCH_P4_UNFLAGGED_BIT ((1ULL) << (ARCH_P4_CNTRVAL_BITS - 1))
>
> #define P4_ESCR_EVENT_MASK 0x7e000000U
> #define P4_ESCR_EVENT_SHIFT 25
> Index: linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
> ===================================================================
> --- linux-2.6.tip.orig/arch/x86/kernel/cpu/perf_event_p4.c
> +++ linux-2.6.tip/arch/x86/kernel/cpu/perf_event_p4.c
> @@ -770,9 +770,14 @@ static inline int p4_pmu_clear_cccr_ovf(
> return 1;
> }
>
> - /* it might be unflagged overflow */
> - rdmsrl(hwc->event_base + hwc->idx, v);
> - if (!(v & ARCH_P4_CNTRVAL_MASK))
> + /*
> + * In some circumstances the overflow might raise an NMI without
> + * setting the P4_CCCR_OVF bit. Since the counter holds a negative
> + * value, we simply check whether the high bit is set; if it is clear,
> + * the counter has reached zero and continued counting before the
> + * real NMI signal was received.
> + */
> + if (!(v & ARCH_P4_UNFLAGGED_BIT))
> return 1;
>
> return 0;
>
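
For illustration only, here is a minimal user-space sketch of why the high-bit test in the quoted patch catches cases that the old "counter is exactly zero" test misses. The three macros are copied from the patch; the helpers counter_after(), old_check() and new_check() are hypothetical names invented for this sketch and are not kernel functions. It assumes the counter is armed with the negated sampling period truncated to 40 bits, as the patch comment ("a counter holds a negative value") suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the quoted patch. */
#define ARCH_P4_CNTRVAL_BITS  (40)
#define ARCH_P4_CNTRVAL_MASK  ((1ULL << ARCH_P4_CNTRVAL_BITS) - 1)
#define ARCH_P4_UNFLAGGED_BIT ((1ULL) << (ARCH_P4_CNTRVAL_BITS - 1))

/*
 * Hypothetical model of a 40-bit counter: it is armed with -period
 * and then incremented once per event, wrapping at 40 bits.
 * The period is assumed to fit below the high bit.
 */
static uint64_t counter_after(uint64_t period, uint64_t events)
{
	assert(period != 0 && period <= ARCH_P4_UNFLAGGED_BIT);
	return (((uint64_t)0 - period) + events) & ARCH_P4_CNTRVAL_MASK;
}

/* Old test: treat as overflow only if the counter reads exactly zero. */
static int old_check(uint64_t v)
{
	return !(v & ARCH_P4_CNTRVAL_MASK);
}

/* New test: treat as overflow once the high (sign) bit is clear. */
static int new_check(uint64_t v)
{
	return !(v & ARCH_P4_UNFLAGGED_BIT);
}
```

With a period of 100000, both checks report an overflow at exactly 100000 events (the counter reads zero). At 100007 events the counter reads 7: old_check() returns 0, so the NMI would be reported as unknown, while new_check() returns 1. Before the period elapses the high bit is still set and both return 0.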