Re: [PATCH 5/5] ftrace: Update dynamic ftrace calls only if necessary
From: Steven Rostedt
Date: Fri Mar 18 2016 - 10:27:35 EST
Patches 4 and 5 seem agnostic to patches 1-3. I'll pull 4,5 into my
tree and 1-3 can go though tip. I'd like to run theses through my tests.
OK?
-- Steve
On Wed, 16 Mar 2016 15:34:33 +0100
Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
> Currently dynamic ftrace calls are updated any time
> the ftrace_ops is un/registered. If we do this update
> only when it's needed, we save lot of time for perf
> system wide ftrace function sampling/counting.
>
> The reason is that for system wide sampling/counting,
> perf creates event for each cpu in the system.
>
> Each event then registers separate copy of ftrace_ops,
> which ends up in FTRACE_UPDATE_CALLS updates. On servers
> with many cpus that means serious stall (240 cpus server):
>
> Counting:
> # time ./perf stat -e ftrace:function -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 370,663 ftrace:function
>
> 1.401427505 seconds time elapsed
>
> real 3m51.743s
> user 0m0.023s
> sys 3m48.569s
>
> Sampling:
> # time ./perf record -e ftrace:function -a sleep 1
> [ perf record: Woken up 0 times to write data ]
> Warning:
> Processed 141200 events and lost 5 chunks!
>
> [ perf record: Captured and wrote 10.703 MB perf.data (135950 samples) ]
>
> real 2m31.429s
> user 0m0.213s
> sys 2m29.494s
>
> There's no reason to do the FTRACE_UPDATE_CALLS update
> for each event in perf case, because all the ftrace_ops
> always share the same filter, so the updated calls are
> always the same.
>
> It's required that only first ftrace_ops registration
> does the FTRACE_UPDATE_CALLS update (also sometimes
> the second if the first one used the trampoline), but
> the rest can be only cheaply linked into the ftrace_ops
> list.
>
> Counting:
> # time ./perf stat -e ftrace:function -a sleep 1
>
> Performance counter stats for 'system wide':
>
> 398,571 ftrace:function
>
> 1.377503733 seconds time elapsed
>
> real 0m2.787s
> user 0m0.005s
> sys 0m1.883s
>
> Sampling:
> # time ./perf record -e ftrace:function -a sleep 1
> [ perf record: Woken up 0 times to write data ]
> Warning:
> Processed 261730 events and lost 9 chunks!
>
> [ perf record: Captured and wrote 19.907 MB perf.data (256293 samples) ]
>
> real 1m31.948s
> user 0m0.309s
> sys 1m32.051s
>
> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
> ---