Re: Problem with perf hardware counters grouping

From: Peter Zijlstra
Date: Thu Sep 01 2011 - 12:42:23 EST


On Thu, 2011-09-01 at 11:21 -0400, Vince Weaver wrote:
> On Thu, 1 Sep 2011, Peter Zijlstra wrote:
>
> > What happens with your >3 case is that while the group is valid and
> > could fit on the PMU, it won't fit at runtime because the NMI watchdog
> > is taking one and won't budge (cpu-pinned counters have precedence over
> > any other kind), effectively starving your group of pmu runtime.

> UGH! I just noticed this problem yesterday and was meaning to track it
> down.
>
> This obviously causes PAPI to fail if you try to use the maximum number of
> counters. Instead of getting EINVAL at open time or even at start time,
> you just silently read all zeros at read time, and by then it's too late
> to do anything useful about the problem because you just missed measuring
> what you were trying to.
>
> Is there any good workaround, or do we have to fall back to trying to
> start/read/stop every proposed event set to make sure it's valid?
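
A starved group can at least be told apart from one that genuinely
counted zero by looking at the enabled/running times the kernel reports
alongside the counts. The following is only a sketch under assumptions,
not a description of what PAPI does: it assumes the group leader was
opened with read_format set to PERF_FORMAT_GROUP |
PERF_FORMAT_TOTAL_TIME_ENABLED | PERF_FORMAT_TOTAL_TIME_RUNNING (and
without PERF_FORMAT_ID), so that read() returns nr, time_enabled,
time_running, then nr values. The names classify_group(),
scaled_count() and enum group_state are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of one group reading. */
enum group_state {
    GROUP_INVALID = -1,   /* buffer too short for the claimed layout  */
    GROUP_NEVER_RAN = 0,  /* time_running == 0: never got on the PMU  */
    GROUP_MULTIPLEXED,    /* ran for only part of the enabled time    */
    GROUP_FULL            /* ran for the whole enabled time           */
};

/*
 * buf holds `words` u64 values as returned by read() on the group
 * leader, assuming the layout (no PERF_FORMAT_ID):
 *   buf[0] = nr, buf[1] = time_enabled, buf[2] = time_running,
 *   buf[3 .. 3+nr-1] = counter values
 */
enum group_state classify_group(const uint64_t *buf, size_t words)
{
    if (buf == NULL || words < 3)
        return GROUP_INVALID;

    uint64_t nr = buf[0];
    if (nr > words - 3)
        return GROUP_INVALID;

    uint64_t enabled = buf[1];
    uint64_t running = buf[2];

    if (running == 0)
        return GROUP_NEVER_RAN;   /* zeros here mean "not measured" */
    if (running < enabled)
        return GROUP_MULTIPLEXED;
    return GROUP_FULL;
}

/* Estimate the count over the full enabled time; 0.0 if it never ran. */
double scaled_count(uint64_t raw, uint64_t enabled, uint64_t running)
{
    if (running == 0)
        return 0.0;
    return (double)raw * ((double)enabled / (double)running);
}
```

A caller would read() the leader fd into a uint64_t array and pass the
number of words read. This does not recover the lost measurement; it
only lets the tool report "never scheduled" instead of silently
handing back zeros.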

I guess my first question is going to be, how do you know what the
maximum number of counters is in the first place?


> This is going to seriously impact performance, and perf_event performance
> is pretty bad to begin with. The whole reason I was writing the tests to
> trigger this is because PAPI users are complaining that perf_event
> overhead is roughly twice that of perfctr or perfmon2, which I've verified
> experimentally.

Yeah, you keep saying this; where does the overhead come from? Only the
lack of userspace rdpmc?