Re: [PATCH] perf, x86: try to handle unknown nmis with running perfctrs

From: Frederic Weisbecker
Date: Fri Aug 13 2010 - 00:37:56 EST


On Wed, Aug 11, 2010 at 01:10:46PM +0200, Robert Richter wrote:
> On 10.08.10 22:44:55, Frederic Weisbecker wrote:
> > On Tue, Aug 10, 2010 at 04:48:56PM -0400, Don Zickus wrote:
> > > @@ -1200,7 +1200,7 @@ void perf_events_lapic_init(void)
> > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > > }
> > >
> > > -static DEFINE_PER_CPU(unsigned int, perfctr_handled);
> > > +static DEFINE_PER_CPU(unsigned int, perfctr_skip);
>
> Yes, using perfctr_skip is better to understand ...
>
> > > @@ -1229,14 +1228,11 @@ perf_event_nmi_handler(struct notifier_block *self,
> > > * was handling a perfctr. Otherwise we pass it and
> > > * let the kernel handle the unknown nmi.
> > > *
> > > - * Note: this could be improved if we drop unknown
> > > - * NMIs only if we handled more than one perfctr in
> > > - * the previous NMI.
> > > */
> > > - this_nmi = percpu_read(irq_stat.__nmi_count);
> > > - prev_nmi = __get_cpu_var(perfctr_handled);
> > > - if (this_nmi == prev_nmi + 1)
> > > + if (__get_cpu_var(perfctr_skip)){
> > > + __get_cpu_var(perfctr_skip) -=1;
> > > return NOTIFY_STOP;
> > > + }
> > > return NOTIFY_DONE;
> > > default:
> > > return NOTIFY_DONE;
> > > @@ -1246,11 +1242,21 @@ perf_event_nmi_handler(struct notifier_block *self,
> > >
> > > apic_write(APIC_LVTPC, APIC_DM_NMI);
> > >
> > > - if (!x86_pmu.handle_irq(regs))
> > > + handled = x86_pmu.handle_irq(regs);
> > > + if (!handled)
> > > + /* not our NMI */
> > > return NOTIFY_DONE;
> > > -
> > > - /* handled */
> > > - __get_cpu_var(perfctr_handled) = percpu_read(irq_stat.__nmi_count);
> > > + else if (handled > 1)
> > > + /*
> > > + * More than one perfctr triggered. This could have
> > > + * caused a second NMI that we must now skip because
> > > + * we have already handled it. Remember it.
> > > + *
> > > + * NOTE: We have no way of knowing if a second NMI was
> > > + * actually triggered, so we may accidentally skip a valid
> > > + * unknown nmi later.
> > > + */
> > > + __get_cpu_var(perfctr_skip) +=1;
>
> ... but this will not work. You have to mark the *absolute* nmi number
> here. If you only raise a flag, the next unknown nmi will be dropped,
> every.

Isn't that what we want? Only the next unknown NMI gets dropped.

> Because, in between there could have been other nmis that
> stopped the chain and thus the 'unknown' path is not executed.

I'm not sure what you mean here. Are you thinking about a third
NMI source that triggers while we are still handling the first
NMI in the back-to-back sequence?

> The trick in my patch is that you *know*, which nmi you want to skip.

Well, with the flag you also know which NMI you want to skip.
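
To make the question concrete, here is a rough user-space sketch of the two
schemes being compared: the relative perfctr_skip counter from Don's patch
and the absolute NMI number from Robert's. This is not kernel code. The
nmi_kind enum, struct cpu_state, the have_handled flag and
count_dropped_unknown() are all made up for illustration, and the model
assumes that an NMI claimed by some other handler never reaches the
"unknown" path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Kinds of NMI in this toy model (hypothetical, not kernel definitions). */
enum nmi_kind {
	NMI_PERF_MULTI,	/* perf NMI where more than one counter overflowed */
	NMI_OTHER,	/* claimed by another handler, chain stops early */
	NMI_UNKNOWN	/* nobody claims it, the "unknown" path runs */
};

struct cpu_state {
	unsigned int nmi_count;		/* stands in for irq_stat.__nmi_count */
	unsigned int perfctr_skip;	/* relative scheme: pending skips */
	unsigned int perfctr_handled;	/* absolute scheme: last multi-counter NMI */
	bool have_handled;		/* absolute scheme: perfctr_handled is valid */
};

/*
 * Feed a sequence of NMIs through one scheme and return how many
 * unknown NMIs were swallowed as "probably a back-to-back perf NMI".
 */
unsigned int count_dropped_unknown(const enum nmi_kind *seq, size_t n,
				   bool use_absolute)
{
	struct cpu_state s = { 0, 0, 0, false };
	unsigned int dropped = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		s.nmi_count++;

		switch (seq[i]) {
		case NMI_PERF_MULTI:
			/* A second NMI may or may not follow; remember that. */
			s.perfctr_skip += 1;
			s.perfctr_handled = s.nmi_count;
			s.have_handled = true;
			break;
		case NMI_OTHER:
			/* Unknown path is not executed, nothing is consumed. */
			break;
		case NMI_UNKNOWN:
			if (use_absolute) {
				/* Drop only the NMI directly after the perf one. */
				if (s.have_handled &&
				    s.nmi_count == s.perfctr_handled + 1)
					dropped++;
			} else if (s.perfctr_skip) {
				/* Drop the next unknown NMI, whenever it comes. */
				s.perfctr_skip -= 1;
				dropped++;
			}
			break;
		}
	}
	return dropped;
}
```

With the sequence { NMI_PERF_MULTI, NMI_UNKNOWN } both schemes drop the
unknown NMI. With { NMI_PERF_MULTI, NMI_OTHER, NMI_OTHER, NMI_UNKNOWN } the
counter scheme still drops the late unknown NMI, while the absolute scheme
lets it through. Whether that second sequence can happen on real hardware is
exactly what is being asked above.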
