On Fri, 2013-07-19 at 09:43 +0200, Ingo Molnar wrote:
* Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
Hello.
The patches are the same, I only tried to update the changelogs a bit.
I am also quoting my old email below, to explain what this hack tries
to do.
Say, "perf record -e sched:sched_switch -p1".
Every task except /sbin/init will do perf_trace_sched_switch() and
perf_trace_buf_prepare() + perf_trace_buf_submit() for no reason,
it doesn't have a counter.
So it makes sense to add the fast-path check at the start of
perf_trace_##call(),
	if (hlist_empty(event_call->perf_events))
		return;
The problem is, we should not do this if __task != NULL (iow, if
DECLARE_EVENT_CLASS() uses __perf_task()), perf_tp_event() has the
additional code for this case.
So we should do
	if (!__task && hlist_empty(event_call->perf_events))
		return;
But __task is changed by "{ assign; }" block right before
perf_trace_buf_submit(). Too late for the fast-path check,
we already called perf_trace_buf_prepare/fetch_regs.
So. After 2/3 __perf_task() (and __perf_count/addr) is called
when ftrace_get_offsets_##call(args) evaluates the arguments,
and we can check !__task && hlist_empty() right after that.
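For illustration only, the ordering described above can be modelled in userspace. This is a sketch, not the patch itself: struct task, buf_prepare(), trace_event() and run_demo() are hypothetical stand-ins, and hlist_head/hlist_empty() are simplified copies of the kernel helpers. It only shows that once the task argument is known before any buffer work, the early return skips the expensive part whenever there is no task and no attached event.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; the kernel's real types differ. */
struct hlist_node { struct hlist_node *next; };
struct hlist_head { struct hlist_node *first; };
struct task { int pid; };

/* Counts how often the (simulated) expensive path was taken. */
static int prepare_calls;

static bool hlist_empty(const struct hlist_head *h)
{
	return h->first == NULL;
}

/* Stand-in for perf_trace_buf_prepare() and the register fetch. */
static void buf_prepare(void)
{
	prepare_calls++;
}

/*
 * Models a perf_trace_##call() body in which the task argument has
 * already been evaluated, so the fast-path check can run first.
 * Returns true if the event went down the slow path.
 */
static bool trace_event(const struct hlist_head *perf_events,
			const struct task *task)
{
	if (!task && hlist_empty(perf_events))
		return false;	/* nobody is listening, skip all the work */

	buf_prepare();
	return true;
}

/* Exercises the three cases and returns the number of slow-path hits. */
static int run_demo(void)
{
	struct hlist_head none = { NULL };
	struct hlist_node counter = { NULL };
	struct hlist_head one = { &counter };
	struct task t = { 1 };

	prepare_calls = 0;
	assert(!trace_event(&none, NULL));	/* fast path */
	assert(trace_event(&one, NULL));	/* an event is attached */
	assert(trace_event(&none, &t));		/* task given, must not skip */
	return prepare_calls;
}
```

In this model only the first call returns early; the other two reach buf_prepare(), which mirrors why the check has to be !__task && hlist_empty() rather than hlist_empty() alone.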
Oleg.
Nice improvement.
Peter, Steve, any objections?
Yep, agreed.
The whole series...
Reviewed-and-Acked-by: Steven Rostedt <rostedt@xxxxxxxxxxx>