Re: [PATCH] perf, x86: catch spurious interrupts after disabling counters

From: Cyrill Gorcunov
Date: Thu Sep 16 2010 - 02:53:33 EST


On Thu, Sep 16, 2010 at 12:10:41AM +0200, Robert Richter wrote:
> On 15.09.10 13:40:12, Cyrill Gorcunov wrote:
> > Yeah, already noted from your previous email. Perhaps we might
> > take a somewhat simpler approach then -- in the nmi handler, where we
> > mark the "next nmi", we could take into account not just "one next"
> > nmi but the sum of handled counters minus the one just handled (of
> > course clearing this count if a new "non spurious" nmi comes in).
> > Can't say I like this approach, but it's just a thought.
>
> If we disable a counter, it might still trigger an interrupt which we
> cannot detect. Thus, if a running counter is deactivated, we must
> count it as handled in the nmi handler.
>
> Working with a sum is not possible, because a disabled counter may or
> *may not* trigger an interrupt. We cannot predict the number of
> counters that will be handled.
>
> Dealing with the "next nmi" is also not handy here. Spurious nmis are
> caused when stopping a counter. Since this is done outside the nmi
> handler, we would then start touching the "next nmi" also outside the
> handler. This might be more complex because we then have to deal with
> locking or atomic access. We shouldn't do that.
>
> -Robert
>
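
To check that I read the rule correctly ("a deactivated running counter
must be counted as handled"), here is a minimal userspace sketch. It is
only a model, not the patch itself: struct cpu_counters, the `running`
mask and all function names are assumptions made up for illustration, and
the `overflowed` argument stands in for whatever overflow detection the
real handler does.

```c
#include <assert.h>

#define NUM_COUNTERS 4

/* Hypothetical per-CPU state, modelled as two plain bitmasks. */
struct cpu_counters {
	unsigned long active;	/* counters currently enabled */
	unsigned long running;	/* counters started and not yet accounted */
};

static void counter_start(struct cpu_counters *c, unsigned int idx)
{
	assert(idx < NUM_COUNTERS);
	c->active  |= 1UL << idx;
	c->running |= 1UL << idx;
}

static void counter_stop(struct cpu_counters *c, unsigned int idx)
{
	assert(idx < NUM_COUNTERS);
	/* Only 'active' is cleared; 'running' is left for the handler. */
	c->active &= ~(1UL << idx);
}

/*
 * Model of the nmi handler. 'overflowed' is a bitmask of counters whose
 * overflow was detected. Returns how many counters count as handled.
 */
static int nmi_handle(struct cpu_counters *c, unsigned long overflowed)
{
	int handled = 0;

	for (unsigned int idx = 0; idx < NUM_COUNTERS; idx++) {
		unsigned long bit = 1UL << idx;

		if (!(c->active & bit)) {
			/*
			 * Stopped counter: it may or may not have raised
			 * this nmi, so claim it once and clear the mark.
			 */
			if (c->running & bit) {
				c->running &= ~bit;
				handled++;
			}
			continue;
		}
		if (overflowed & bit)
			handled++;
	}
	return handled;
}
```

Since `running` is touched only from inside the handler after the stop,
nothing outside the handler has to update the "next nmi" state, which is
the locking concern raised above.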

OK, I see what you mean, Robert. Btw, when you reordered the cpu_active_mask
access and the wrmsr, did you also try an additional read of the msr after
the write? I.e. something like

wrmsr
barrier() // just to be sure gcc would not reorder it
rdmsr
clear cpu_active_mask

I wonder whether that would do the trick.
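
Spelled out as a userspace model, the ordering I have in mind looks
roughly like this. Everything here is a stand-in: fake_wrmsr()/fake_rdmsr()
just access an array, active_mask is a plain variable, and ENABLE_BIT is
assumed to be bit 22 of the event select register. Whether the read-back
really closes the window on real hardware is exactly the open question;
the sketch only shows the sequence.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_COUNTERS	4
#define ENABLE_BIT	(1ULL << 22)	/* assumed enable bit of the event select */

/* Compiler-only barrier, the same idea as the kernel's barrier(). */
#define barrier()	__asm__ __volatile__("" ::: "memory")

/* Hypothetical stand-ins for the real MSRs and the per-CPU mask. */
static uint64_t fake_msr[NUM_COUNTERS];
static unsigned long active_mask;

static void fake_wrmsr(unsigned int idx, uint64_t val)
{
	fake_msr[idx] = val;
}

static uint64_t fake_rdmsr(unsigned int idx)
{
	return fake_msr[idx];
}

/* Disable counter idx; returns the value read back after the write. */
static uint64_t disable_counter(unsigned int idx)
{
	uint64_t val;

	assert(idx < NUM_COUNTERS);

	fake_wrmsr(idx, fake_rdmsr(idx) & ~ENABLE_BIT);	/* wrmsr */
	barrier();		/* keep gcc from reordering the read */
	val = fake_rdmsr(idx);	/* rdmsr: read back what was written */
	active_mask &= ~(1UL << idx);	/* only now clear the active bit */

	return val;
}
```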

-- Cyrill