Re: [regression 2.6.39-rc2][bisected] "perf, x86: P4 PMU - Read proper MSR register to catch" and NMIs

From: Ingo Molnar
Date: Thu Apr 14 2011 - 04:05:50 EST



* Cyrill Gorcunov <gorcunov@xxxxxxxxxx> wrote:

> On Thu, Apr 14, 2011 at 10:47 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
> >
> > * Cyrill Gorcunov <gorcunov@xxxxxxxxxx> wrote:
> >
> >> -     apic_write(APIC_LVTPC, APIC_DM_NMI);
> >>
> >>       handled = x86_pmu.handle_irq(args->regs);
> >>       if (!handled)
> >>               return NOTIFY_DONE;
> >>
> >> +     /*
> >> +      * Unmasking should be done after the IRQ is handled, otherwise
> >> +      * there is a race between clearing of the counter overflow
> >> +      * flag and LVT entry unmasking (which might lead to double
> >> +      * NMI generation).
> >> +      */
> >> +     apic_write(APIC_LVTPC, APIC_DM_NMI);
> >
> > Here we could leak a masked IRQ through the !handled path. If we got a LVTPC
> > irq we better handle it and unmask the LVTPC unconditionally - regardless of
> > whether we consider it 'handled' or not from the kernel POV ...
> >
> > Thanks,
> >
> >        Ingo
>
> If no counter has overflowed, I believe we should not poke LVTPC until we are
> sure the NMI came from it. As far as I can tell, a counter overflow is the
> only sign that the NMI came from LVTPC. I also see a possible race: if the
> counter signal has reached LVTPC and is still being processed inside the APIC
> chip (which might take some time before the real NMI signal appears in the
> CPU), it is hard to tell what we would get as a result: a double NMI again or
> something else.

Well, we unmasked unconditionally before. If we unmask conditionally now, we
risk not unmasking at all. That trades spurious NMIs for a completely stuck PMU:
if we ever forget to unmask, *no* NMI will ever come from it again.
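
To make the failure mode concrete, here is a rough userspace model of the two
orderings. It is only a sketch, not kernel code: the lvtpc variable, the
model_* helpers and the two handler stubs are made-up names, and the model
assumes the NMI really did come from LVTPC (so the hardware has set the mask
bit) while the handler reports nothing handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as in the kernel's apicdef.h; everything else is a model. */
#define APIC_LVT_MASKED (1u << 16)
#define APIC_DM_NMI     0x00400u

/* Hypothetical stand-in for the LVTPC register. */
static uint32_t lvtpc = APIC_DM_NMI;

static void model_apic_write_lvtpc(uint32_t v) { lvtpc = v; }

/* Assumption: the APIC sets the mask bit when it delivers the PMU NMI. */
static void model_deliver_pmi(void) { lvtpc |= APIC_LVT_MASKED; }

static bool lvtpc_masked(void) { return (lvtpc & APIC_LVT_MASKED) != 0; }

typedef int (*handle_irq_fn)(void);

/* Stubs for x86_pmu.handle_irq(): overflow seen vs. nothing seen. */
static int handler_found_overflow(void) { return 1; }
static int handler_found_nothing(void)  { return 0; }

/* Conditional variant, as in the quoted patch: unmask only when handled. */
static bool nmi_conditional_unmask(handle_irq_fn handle_irq)
{
	assert(handle_irq);
	model_deliver_pmi();
	if (!handle_irq())
		return false;	/* NOTIFY_DONE path: LVTPC stays masked */
	model_apic_write_lvtpc(APIC_DM_NMI);
	return true;
}

/* Unconditional variant: unmask after the handler on every path. */
static bool nmi_unconditional_unmask(handle_irq_fn handle_irq)
{
	int handled;

	assert(handle_irq);
	model_deliver_pmi();
	handled = handle_irq();
	model_apic_write_lvtpc(APIC_DM_NMI);
	return handled != 0;
}
```

With handler_found_nothing, the conditional variant returns with the model's
mask bit still set, which is the stuck-PMU case. The unconditional variant
clears it on both paths while still unmasking only after the handler has run.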

Maybe we can do it - but it will need a lot of testing on a lot of CPU types to
make sure there are no other CPU quirks in this area ...

So unless the conditional unmasking fixes a real bug (in kgdb or elsewhere),
let's unmask unconditionally now to fix the P4 regression in .39 - and queue up
a *separate* patch that moves it even further down and makes it conditional -
but queue that up for .40.

Thanks,

Ingo