Re: [PATCH -v3] perf, x86: try to handle unknown nmis with running perfctrs

From: Don Zickus
Date: Thu Aug 26 2010 - 17:14:50 EST


On Mon, Aug 23, 2010 at 10:53:39AM +0200, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@xxxxxxx> wrote:
>
> >
> > * Don Zickus <dzickus@xxxxxxxxxx> wrote:
> >
> > > I'll test tip later today to see if I can reproduce it.
> > >
> > > Cheers,
> > > Don
> > >
> > > Ingo Molnar <mingo@xxxxxxx> wrote:
> > >
> > > >
> > > >it's not working so well, i'm getting:
> > > >
> > > > Uhhuh. NMI received for unknown reason 00 on CPU 9.
> > > > Do you have a strange power saving mode enabled?
> > > > Dazed and confused, but trying to continue
> > > >
> > > >on a nehalem box, after a perf top and perf stat run.
> >
> > FYI, it does not trigger on an AMD box.
>
> Ok, to not hold up the perf/urgent flow i zapped these two commits for
> the time being:
>
> 4a31beb: perf, x86: Fix handle_irq return values
> 8e3e42b: perf, x86: Try to handle unknown nmis with an enabled PMU
>
> We can apply them if they take a form that dont introduce a different
> kind of (and more visible) regression.

So this patch fixes it, though I haven't convinced myself why (perhaps
babysitting my 4-month-old isn't helping :-))

With the ack moved to the top of the loop, the code now re-enters the loop
and reprocesses the new status, which properly increments handled to 2, so
the new logic takes care of it.
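
One way to picture the effect is the toy model below. It is not kernel
code and only a sketch of one possible explanation: fake_pmu,
fake_ack_status, fake_handle_irq and count_handled are made-up names, and
the model assumes that the same counter overflows a second time while the
first overflow is still being processed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only; none of these names exist in the kernel. */
struct fake_pmu {
	uint64_t status;     /* pending overflow bits, like a global status register */
	int late_overflows;  /* overflows of counter 0 still to be injected mid-handler */
};

/* Clear the acknowledged bits, as an ack write would. */
static void fake_ack_status(struct fake_pmu *pmu, uint64_t ack)
{
	pmu->status &= ~ack;
}

/* Returns how many overflows the handler processed. */
static int fake_handle_irq(struct fake_pmu *pmu, bool ack_early)
{
	int handled = 0;
	uint64_t status = pmu->status;

	while (status) {
		uint64_t ack = status;

		/* Patched ordering: ack before processing. */
		if (ack_early)
			fake_ack_status(pmu, ack);

		/* Process every overflow bit seen in this pass. */
		for (int bit = 0; bit < 64; bit++)
			if (status & (1ULL << bit))
				handled++;

		/* Assumption: counter 0 overflows again while we are busy. */
		if (pmu->late_overflows > 0) {
			pmu->late_overflows--;
			pmu->status |= 1ULL;
		}

		/* Old ordering: ack after processing. */
		if (!ack_early)
			fake_ack_status(pmu, ack);

		/* Repeat if there is more work to be done. */
		status = pmu->status;
	}
	return handled;
}

/* One initial overflow on counter 0, plus one more arriving mid-handler. */
static int count_handled(bool ack_early)
{
	struct fake_pmu pmu = { .status = 1ULL, .late_overflows = 1 };

	return fake_handle_irq(&pmu, ack_early);
}
```

In this model count_handled(true) returns 2 and count_handled(false)
returns 1: with the late ack, the second overflow's bit is wiped before the
loop re-reads the status, so it is never counted.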

Cheers,
Don


diff --git a/arch/x86/kernel/cpu/perf_event_intel.c b/arch/x86/kernel/cpu/perf_event_intel.c
index 4539b4b..d16ebd8 100644
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -738,6 +738,7 @@ again:

 	inc_irq_stat(apic_perf_irqs);
 	ack = status;
+	intel_pmu_ack_status(ack);
 
 	intel_pmu_lbr_read();
 
@@ -766,8 +767,6 @@ again:
 			x86_pmu_stop(event);
 	}
 
-	intel_pmu_ack_status(ack);
-
 	/*
 	 * Repeat if there is more work to be done:
 	 */