[PATCH 5/5] ftrace: Update dynamic ftrace calls only if necessary

From: Jiri Olsa
Date: Wed Mar 09 2016 - 15:48:04 EST


Currently dynamic ftrace calls are updated any time
the ftrace_ops is un/registered. If we do this update
only when it's needed, we save lot of time for perf
system wide ftrace function sampling/counting.

The reason is that for system wide sampling/counting,
perf creates event for each cpu in the system.

Each event then registers separate copy of ftrace_ops,
which ends up in FTRACE_UPDATE_CALLS updates. On servers
with many cpus that means serious stall (240 cpus server):

Counting:
# time ./perf stat -e ftrace:function -a sleep 1

Performance counter stats for 'system wide':

370,663 ftrace:function

1.401427505 seconds time elapsed

real 3m51.743s
user 0m0.023s
sys 3m48.569s

Sampling:
# time ./perf record -e ftrace:function -a sleep 1
[ perf record: Woken up 0 times to write data ]
Warning:
Processed 141200 events and lost 5 chunks!

[ perf record: Captured and wrote 10.703 MB perf.data (135950 samples) ]

real 2m31.429s
user 0m0.213s
sys 2m29.494s

There's no reason to do the FTRACE_UPDATE_CALLS update
for each event in perf case, because all the ftrace_ops
always share the same filter, so the updated calls are
always the same.

It's required that only first ftrace_ops registration
does the FTRACE_UPDATE_CALLS update (also sometimes
the second if the first one used the trampoline), but
the rest can be only cheaply linked into the ftrace_ops
list.

Counting:
# time ./perf stat -e ftrace:function -a sleep 1

Performance counter stats for 'system wide':

398,571 ftrace:function

1.377503733 seconds time elapsed

real 0m2.787s
user 0m0.005s
sys 0m1.883s

Sampling:
# time ./perf record -e ftrace:function -a sleep 1
[ perf record: Woken up 0 times to write data ]
Warning:
Processed 261730 events and lost 9 chunks!

[ perf record: Captured and wrote 19.907 MB perf.data (256293 samples) ]

real 1m31.948s
user 0m0.309s
sys 1m32.051s

Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
---
kernel/trace/ftrace.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 123dddc660e9..48b491463549 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2650,7 +2650,6 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
return ret;

ftrace_start_up++;
- command |= FTRACE_UPDATE_CALLS;

/*
* Note that ftrace probes uses this to start up
@@ -2671,7 +2670,8 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
return ret;
}

- ftrace_hash_rec_enable(ops, 1);
+ if (ftrace_hash_rec_enable(ops, 1))
+ command |= FTRACE_UPDATE_CALLS;

ftrace_startup_enable(command);

@@ -2701,11 +2701,11 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)

/* Disabling ipmodify never fails */
ftrace_hash_ipmodify_disable(ops);
- ftrace_hash_rec_disable(ops, 1);

- ops->flags &= ~FTRACE_OPS_FL_ENABLED;
+ if (ftrace_hash_rec_disable(ops, 1))
+ command |= FTRACE_UPDATE_CALLS;

- command |= FTRACE_UPDATE_CALLS;
+ ops->flags &= ~FTRACE_OPS_FL_ENABLED;

if (saved_ftrace_func != ftrace_trace_function) {
saved_ftrace_func = ftrace_trace_function;
--
2.4.3