ftrace global trace_pipe_raw

From: Claudio
Date: Tue Jul 24 2018 - 05:58:27 EST


Hello Steven,

I am correlating Linux sched events, following all tasks across CPUs,
and it would be really convenient to have a global trace_pipe_raw,
in addition to the per-cpu ones, with the events already sorted.

I would imagine the core functionality is already available, since trace_pipe
in the tracing directory already shows all events regardless of CPU, and so
it would be a matter of doing the same for trace_pipe_raw.

But is there a good reason why trace_pipe_raw is available only per-cpu?

Would work toward adding a global trace_pipe_raw be considered
for inclusion?

Thank you,

Claudio
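
For reference, the userspace merge that a global, already sorted
trace_pipe_raw would make unnecessary can be sketched roughly as below.
This is only an illustration under assumptions: the Event tuple and its
field names are hypothetical, decoding of the raw per-cpu ring buffer
pages is not shown, and each per-cpu stream is assumed to be already
ordered by timestamp.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical decoded event; not the kernel's on-disk layout."""
    ts: int    # timestamp in nanoseconds
    cpu: int   # CPU the event was recorded on
    name: str  # event name, e.g. "sched_switch"


def merge_per_cpu(streams: Iterable[Iterable[Event]]) -> Iterator[Event]:
    """Lazily k-way merge per-cpu streams into one timestamp-ordered stream.

    Each input stream must already be sorted by timestamp; heapq.merge
    then only needs to keep one pending event per stream in its heap.
    """
    return heapq.merge(*streams, key=lambda ev: ev.ts)


# Made-up sample data standing in for two decoded per-cpu streams.
cpu0 = [Event(100, 0, "sched_switch"), Event(250, 0, "sched_wakeup")]
cpu1 = [Event(120, 1, "sched_wakeup"), Event(300, 1, "sched_switch")]

merged = list(merge_per_cpu([cpu0, cpu1]))
for ev in merged:
    print(ev.ts, ev.cpu, ev.name)
```

The merge is lazy, so it can also be fed from generators that read the
per-cpu files incrementally instead of from in-memory lists.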

On 07/09/2018 05:32 PM, Steven Rostedt wrote:
> On Fri, 6 Jul 2018 08:22:01 +0200
> Claudio <claudio.fontana@xxxxxxxxx> wrote:
>
>> Hello all,
>>
>> I have been experimenting with the idea of leaving ftrace enabled, with sched events,
>> on production systems.
>>
>> The main concern that I am having at the moment is about the impact on the system.
>> Enabling the sched events that I currently need for the tracing application
>> seems to slow down context-switches considerably, and make the system less responsive.
>>
>> I have tested with cyclictest on the mainline kernel, and noticed an increase of around 25% in the min and avg latencies.
>>
>> Is this expected?
>>
>> Some initial investigation into ftrace seems to point at the reservation and commit of the events into the ring buffer
>> as the largest sources of overhead, while copying the event parameters, including COMM, does not seem to have any noticeable effect
>> relative to those costs.
>>
>> I ran the following test 20 times and discarded the first results:
>>
>> $ sudo ./cyclictest --smp -p95 -m -s -N -l 100000 -q
>
> OK, I just noticed that you are using -N which means all numbers are in
> nanoseconds.
>
>>
>> $ uname -a
>> Linux claudio-HP-ProBook-470-G5 4.18.0-rc3+ #3 SMP Tue Jul 3 15:50:30 CEST 2018 x86_64 x86_64 x86_64 GNU/Linux
>>
>> For brevity, this is a comparison of one test's results. All other test results show the same ~25% increase.
>>
>> On the left side, the run without ftrace sched events, on the right side with ftrace sched events enabled.
>>
>> CPU   Count   Min   Act   Avg     Max     Count  Min-ftrace  Act-ftrace  Avg-ftrace  Max-ftrace
>>   0  100000  2339  2936  2841  139478    100000        2900        3182        3566       93056
>>   1   66742  2365  3386  2874   93639     66750        2959        3786        3646      154074
>>   2   50080  2376  3058  2910  196221     50097        2997        4209        3655       18707
>>   3   40076  2394  3461  2931   17914     40091        3006        4417        3750       17159
>>   4   33404  2371  3612  2834   15336     33419        2997        3836        3594       23172
>>   5   28635  2387  3313  2885   25863     28649        2995        3795        3647        9956
>>   6   25058  2384  3428  2968   12162     25071        3051        4366        3719       18151
>>   7   22275  2381  2859  2982   10706     22287        3046        5078        3825       10781
>>
>> I would be thankful for any advice or comments on this,
>> especially with the goal of lowering the runtime impact on the system as much as possible.
>
> Thus, the tracing is causing the wakeup time to be an average of 0.8us
> longer.
>
> Yes that is expected.
>
> -- Steve
>
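
The 0.8us figure and the ~25% increase quoted above can be checked
against the Min and Avg columns of the cyclictest table. The short script
below is only a sanity check of that arithmetic; the numbers are copied
from the table (nanoseconds, since -N was used), and the variable names
are made up for this illustration.

```python
# (cpu, min_off, avg_off, min_on, avg_on), all latencies in nanoseconds.
# "off" = without ftrace sched events, "on" = with them enabled.
rows = [
    (0, 2339, 2841, 2900, 3566),
    (1, 2365, 2874, 2959, 3646),
    (2, 2376, 2910, 2997, 3655),
    (3, 2394, 2931, 3006, 3750),
    (4, 2371, 2834, 2997, 3594),
    (5, 2387, 2885, 2995, 3647),
    (6, 2384, 2968, 3051, 3719),
    (7, 2381, 2982, 3046, 3825),
]

# Absolute increase of the average latency on each CPU.
avg_deltas_ns = [avg_on - avg_off for _, _, avg_off, _, avg_on in rows]
mean_avg_delta_ns = sum(avg_deltas_ns) / len(avg_deltas_ns)

# Relative increase of the Min and Avg latencies, averaged over the CPUs.
mean_min_pct = sum(
    100.0 * (min_on - min_off) / min_off for _, min_off, _, min_on, _ in rows
) / len(rows)
mean_avg_pct = sum(
    100.0 * (avg_on - avg_off) / avg_off for _, _, avg_off, _, avg_on in rows
) / len(rows)

print(f"mean Avg increase: {mean_avg_delta_ns:.0f} ns")
print(f"mean Min increase: {mean_min_pct:.1f} %")
print(f"mean Avg increase: {mean_avg_pct:.1f} %")
```

The mean increase of the Avg column comes out a little under 800 ns,
which matches the "0.8us" estimate once rounded.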