Re: [RFC PATCH 5/5] perf: Implement perf_output_addr()

From: Steven Rostedt
Date: Wed May 19 2010 - 12:08:20 EST


On Wed, 2010-05-19 at 17:50 +0200, Peter Zijlstra wrote:
> On Wed, 2010-05-19 at 11:38 -0400, Steven Rostedt wrote:
>
> > > No, but suppose the tracepoint has a racy expression in it. Having to
> > > evaluate { assign; } multiple times could yield different results, which
> > > in turn means you have to run the filter multiple times too, etc.
> >
> > I'm still a bit confused by what you mean here. Could you show an
> > example?
>
> Well, suppose { assign; } contains:
>
> entry->foo = atomic_read(&bar);
>
> Now suppose you have multiple active consumers of the tracepoint, either
> you do the evaluation once and copy that around, or you do it multiple
> times and end up with different results.

OK, this is where I'm getting a bit lost: the "multiple active
consumers". Is this multiple instances of perf, or perf doing multiple
things with that event using different buffers?

>
> > > Although I suppose you could delay the commit of the first event and copy
> > > from there into the next events, but that might give rather messy code.
> > >
> > > > Note, the shrinking of the TRACE_EVENT() code that I pushed (and I'm
> > > > hoping makes it to 35 since it lays the ground work for lots of features
> > > > on top of TRACE_EVENT()), allows you to pass private data to each probe
> > > > registered to the tracepoint. Letting the same function handle two
> > > > different activities, or different tracepoints.
> > >
> > > tracepoint_probe_register() is useless, it requires scheduling. I
> > > currently register a probe on perf_event creation and then maintain a
> > > per-cpu hlist of active events.
> >
> > When is perf_event creation? When the user runs the code or at boot up?
>
> sys_perf_counter_open()
>
> And an event could be per task, so it needs to be scheduled along with
> the task context, try doing that with probes ;-)

Ah, this is basically the same thing that ftrace does too. It only
enables the tracepoint (or function tracer) at initiation of the trace,
and uses things like a hash table to determine if the event (or
function) should be traced or not.
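
To illustrate the pattern I mean, here is a rough userspace sketch. It is
not the actual ftrace code: the table layout, hash function, and all names
(hnode, event_enable, event_is_active, probe) are made up for the example.
The probe is hooked up once when tracing starts, and every hit does a cheap
hash lookup to decide whether to record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_BITS 4
#define HASH_SIZE (1u << HASH_BITS)

/* Hypothetical node: one per active event (or traced function). */
struct hnode {
	uint32_t id;
	struct hnode *next;
};

static struct hnode *table[HASH_SIZE];
static int hits;	/* counts events that passed the lookup */

/* Multiplicative hash; the top HASH_BITS bits select the bucket. */
static unsigned int hash_id(uint32_t id)
{
	return (uint32_t)(id * 2654435761u) >> (32 - HASH_BITS);
}

/* Done once, at initiation of the trace (not in the hot path). */
static void event_enable(struct hnode *n, uint32_t id)
{
	unsigned int h = hash_id(id);

	assert(n != NULL);
	n->id = id;
	n->next = table[h];
	table[h] = n;
}

/* Hot path: walk one bucket to see whether this id is being traced. */
static int event_is_active(uint32_t id)
{
	const struct hnode *n;

	for (n = table[hash_id(id)]; n; n = n->next)
		if (n->id == id)
			return 1;
	return 0;
}

/* Stand-in for the probe attached to the tracepoint. */
static void probe(uint32_t id)
{
	if (!event_is_active(id))
		return;		/* not traced: bail out early */
	hits++;			/* traced: this is where we would record */
}
```

The real thing also has to deal with per-cpu data and with removing entries
safely, which this sketch ignores.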

>
> > Hmm, could be, don't know for sure. I just want to keep the macro magic
> > to a minimum ;-)
>
> Right, but filters evaluated at the point where you've basically already
> done all the hard work simply don't make much sense in my book.

Well, the hard work was just to reserve the buffer, which is under 100ns
to do. But we still need the assign, because the filters compare the
result of those assigns.

I guess you are saying that if we have a filter, we need to do the
assign to a temporary buffer, evaluate, and then decide if we should
record it (via copy) or not.
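
If I understand it right, that flow would look something like the userspace
sketch below. This is only my reading of the idea, not existing perf or
ftrace code: struct entry, struct consumer, the min_foo filter, and the plain
int "bar" (standing in for the atomic_t) are all assumptions for
illustration. The point is that { assign; } runs exactly once, so every
consumer filters on, and copies, the same value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_ENTRIES 8

/* Hypothetical record; stands in for what TRACE_EVENT() would generate. */
struct entry {
	int foo;
};

/* Hypothetical consumer: a simple filter plus its own output buffer. */
struct consumer {
	int min_foo;			/* record only if foo >= min_foo */
	struct entry buf[BUF_ENTRIES];
	size_t used;
};

static int bar;	/* stand-in for the racy state read by { assign; } */

/* The { assign; } step: reads racy state, so it must run only once. */
static void assign(struct entry *e)
{
	e->foo = bar;
}

static int filter_match(const struct consumer *c, const struct entry *e)
{
	return e->foo >= c->min_foo;
}

/* Returns how many consumers recorded the event. */
static int trace_event(struct consumer *cons, size_t n)
{
	struct entry tmp;
	int recorded = 0;
	size_t i;

	assert(cons != NULL || n == 0);

	assign(&tmp);			/* evaluate once into a temporary */

	for (i = 0; i < n; i++) {
		struct consumer *c = &cons[i];

		if (!filter_match(c, &tmp))
			continue;	/* filtered out: nothing reserved */
		if (c->used == BUF_ENTRIES)
			continue;	/* buffer full: drop */
		memcpy(&c->buf[c->used++], &tmp, sizeof(tmp));
		recorded++;
	}
	return recorded;
}
```

The cost is an extra copy per recording consumer, compared to assigning
straight into the reserved buffer.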

-- Steve

