Re: perf group read for inherited events

From: Peter Zijlstra
Date: Tue May 30 2017 - 14:53:17 EST


On Tue, May 30, 2017 at 01:55:33PM -0400, Vince Weaver wrote:
> So the issue is currently if you were sampling, and you were sampling on
> an event group, and you had set PERF_SAMPLE_READ to get all counts for a
> group, and the event was also inherited....

No, anything PERF_SAMPLE_READ (group or not) on inherited events is
wrong.

It would only report the current task's event count + all dead child
counts (if you hit the parent event). It would not include the current
count of any other live tasks in the hierarchy.
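
To make the arithmetic concrete, here is a rough userspace model, not
kernel code. The struct and function names are made up for
illustration, and it assumes a dead child's count has already been
folded back into the parent:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one child task's share of an inherited event. */
struct task_count {
	uint64_t count;   /* events this task has counted so far */
	bool exited;      /* true once the task has died */
};

/*
 * What a sample taken on the parent event would carry: the parent's own
 * count plus the counts of children that have already exited.
 */
static uint64_t sampled_value(uint64_t parent_count,
			      const struct task_count *children, size_t n)
{
	uint64_t total = parent_count;

	for (size_t i = 0; i < n; i++)
		if (children[i].exited)
			total += children[i].count;
	return total;
}

/* What one would actually want: every task in the hierarchy, live or dead. */
static uint64_t hierarchy_total(uint64_t parent_count,
				const struct task_count *children, size_t n)
{
	uint64_t total = parent_count;

	for (size_t i = 0; i < n; i++)
		total += children[i].count;
	return total;
}
```

With a parent at 1000, dead children at 100 and 40, and a live child at
250, the sample would carry 1140 while the hierarchy has counted 1390.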

And the problem is that fixing this is rather tricky, and iterating the
hierarchy can be excessively expensive in any case (imagine having to
iterate several hundred tasks from that NMI/interrupt).

And since it's from NMI/interrupt context it is impossible to get the
current count of any other live tasks that are running on another CPU.

> perf_event_open() would let
> you do this even though the results would probably be wrong?

Right, currently we have the filter on PERF_FORMAT_GROUP, but it should
be PERF_SAMPLE_READ.

So a !group SAMPLE_READ on inherited is currently allowed but returns
'interesting' values.
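
The difference between the two filters can be sketched as below. This
is a standalone illustration, not the actual kernel check: the struct
and helper names are hypothetical, and the SKETCH_* macros are local
stand-ins that I believe match the uapi bit values, so that it builds
without kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the uapi bits (values assumed, see above). */
#define SKETCH_PERF_SAMPLE_READ   (1U << 4)
#define SKETCH_PERF_FORMAT_GROUP  (1U << 3)

/* Hypothetical cut-down view of the attr fields that matter here. */
struct sketch_attr {
	uint64_t sample_type;
	uint64_t read_format;
	bool inherit;
};

/* Filter as described for the current code: keyed on the group format. */
static bool rejected_by_current_filter(const struct sketch_attr *a)
{
	return a->inherit && (a->read_format & SKETCH_PERF_FORMAT_GROUP);
}

/* Filter keyed on PERF_SAMPLE_READ instead, as suggested above. */
static bool rejected_by_proposed_filter(const struct sketch_attr *a)
{
	return a->inherit && (a->sample_type & SKETCH_PERF_SAMPLE_READ);
}
```

An inherited, non-group event with SAMPLE_READ set gets past the first
check but not the second, which is the case described above.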