Re: [PATCH 5/5] ftrace: Update dynamic ftrace calls only if necessary
From: Namhyung Kim
Date: Wed Mar 16 2016 - 20:25:37 EST
On Wed, Mar 16, 2016 at 03:34:33PM +0100, Jiri Olsa wrote:
> Currently dynamic ftrace calls are updated any time
> the ftrace_ops is un/registered. If we do this update
> only when it's needed, we save lot of time for perf
> system wide ftrace function sampling/counting.
>
> The reason is that for system wide sampling/counting,
> perf creates event for each cpu in the system.
>
> Each event then registers separate copy of ftrace_ops,
> which ends up in FTRACE_UPDATE_CALLS updates. On servers
> with many cpus that means serious stall (240 cpus server):
>
> Counting:
> # time ./perf stat -e ftrace:function -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 370,663 ftrace:function
>
> 1.401427505 seconds time elapsed
>
> real 3m51.743s
> user 0m0.023s
> sys 3m48.569s
>
> Sampling:
> # time ./perf record -e ftrace:function -a sleep 1
> [ perf record: Woken up 0 times to write data ]
> Warning:
> Processed 141200 events and lost 5 chunks!
>
> [ perf record: Captured and wrote 10.703 MB perf.data (135950 samples) ]
>
> real 2m31.429s
> user 0m0.213s
> sys 2m29.494s
>
> There's no reason to do the FTRACE_UPDATE_CALLS update
> for each event in perf case, because all the ftrace_ops
> always share the same filter, so the updated calls are
> always the same.
>
> It's required that only first ftrace_ops registration
> does the FTRACE_UPDATE_CALLS update (also sometimes
> the second if the first one used the trampoline), but
> the rest can be only cheaply linked into the ftrace_ops
> list.
>
> Counting:
> # time ./perf stat -e ftrace:function -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 398,571 ftrace:function
>
> 1.377503733 seconds time elapsed
>
> real 0m2.787s
> user 0m0.005s
> sys 0m1.883s
>
> Sampling:
> # time ./perf record -e ftrace:function -a sleep 1
> [ perf record: Woken up 0 times to write data ]
> Warning:
> Processed 261730 events and lost 9 chunks!
>
> [ perf record: Captured and wrote 19.907 MB perf.data (256293 samples) ]
>
> real 1m31.948s
> user 0m0.309s
> sys 1m32.051s
>
> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
Acked-by: Namhyung Kim <namhyung@xxxxxxxxxx>
Thanks,
Namhyung
> ---
> kernel/trace/ftrace.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 3a9a12215b50..22ef6fc597a2 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -2651,7 +2651,6 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
> return ret;
>
> ftrace_start_up++;
> - command |= FTRACE_UPDATE_CALLS;
>
> /*
> * Note that ftrace probes uses this to start up
> @@ -2672,7 +2671,8 @@ static int ftrace_startup(struct ftrace_ops *ops, int command)
> return ret;
> }
>
> - ftrace_hash_rec_enable(ops, 1);
> + if (ftrace_hash_rec_enable(ops, 1))
> + command |= FTRACE_UPDATE_CALLS;
>
> ftrace_startup_enable(command);
>
> @@ -2702,11 +2702,11 @@ static int ftrace_shutdown(struct ftrace_ops *ops, int command)
>
> /* Disabling ipmodify never fails */
> ftrace_hash_ipmodify_disable(ops);
> - ftrace_hash_rec_disable(ops, 1);
>
> - ops->flags &= ~FTRACE_OPS_FL_ENABLED;
> + if (ftrace_hash_rec_disable(ops, 1))
> + command |= FTRACE_UPDATE_CALLS;
>
> - command |= FTRACE_UPDATE_CALLS;
> + ops->flags &= ~FTRACE_OPS_FL_ENABLED;
>
> if (saved_ftrace_func != ftrace_trace_function) {
> saved_ftrace_func = ftrace_trace_function;
> --
> 2.4.3
>