Re: [RFC PATCH 1/3] Unified trace buffer
From: Ingo Molnar
Date: Sat Sep 27 2008 - 13:50:51 EST
* Ingo Molnar <mingo@xxxxxxx> wrote:
> Historically we've been flip-flopping on that issue in ftrace, whether
> it should be coherent by default or not. We had at least three or four
> variations of global synchronization. (one was an atomic generation
> counter, another variant a global lock)
let me outline why that flip-flopping occurred.
- coherent tracer: has built-in serialization of global events. If the
tracer shows events to be after each other, they were after each
other.
- incoherent tracer: events might be mixed up slightly on the micro
scale.
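To make the distinction concrete, here is a minimal userspace sketch in C11, not ftrace code. It assumes the coherent variant is built on a shared atomic generation counter (one of the variations mentioned above); all names (trace_coherent, trace_incoherent, cpu_buffer, NR_CPUS, BUF_SIZE) are hypothetical and only serve the illustration. The coherent path makes every CPU touch one shared cache line per event, the incoherent path touches only per-CPU state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS  4   /* hypothetical CPU count, for illustration only */
#define BUF_SIZE 64  /* events per per-CPU ring buffer */

struct trace_event {
	uint64_t stamp; /* global sequence number or CPU-local timestamp */
	int      id;    /* event identifier */
};

struct cpu_buffer {
	struct trace_event ev[BUF_SIZE];
	unsigned int       head; /* total number of events ever written */
};

static struct cpu_buffer buffers[NR_CPUS];
static atomic_uint_fast64_t global_seq; /* shared by all CPUs */

/*
 * Coherent: each event takes a ticket from one shared counter, so
 * sorting by stamp gives the order in which tickets were handed out.
 * The price is one atomic read-modify-write on shared memory per event.
 */
static uint64_t trace_coherent(int cpu, int id)
{
	uint64_t seq = atomic_fetch_add(&global_seq, 1);
	struct cpu_buffer *b = &buffers[cpu];

	b->ev[b->head % BUF_SIZE] = (struct trace_event){ .stamp = seq, .id = id };
	b->head++;
	return seq;
}

/*
 * Incoherent: the caller passes a CPU-local timestamp and only the
 * per-CPU buffer is written. No shared state is touched, so events
 * from different CPUs may appear slightly out of order when merged.
 */
static void trace_incoherent(int cpu, uint64_t local_ts, int id)
{
	struct cpu_buffer *b = &buffers[cpu];

	b->ev[b->head % BUF_SIZE] = (struct trace_event){ .stamp = local_ts, .id = id };
	b->head++;
}
```

Both variants overwrite the oldest entry once a buffer wraps; that is a simplification of the sketch, not a statement about how ftrace manages its buffers.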
In your tree you'll see dozens of fixes from me in the past 10 years or
so where i used various tracers to find some bug. Some of them were done
with coherent tracers, some of them were done with incoherent tracers.
Here are a few common patterns that influenced which kind of tracer i used:
- SMP races. There it's really important to see the ordering of events,
and coherent tracers (where the ordering of events as displayed by
the tracer can be trusted) are used by default
- EXCEPT: _very_ often an SMP race goes away if we add global
synchronization to trace events. Sometimes the pure delay can hide
races. So incoherent tracers are very important here.
- analysis of performance problems: here incoherent tracers win hands
down. It's important to see all events on all CPUs, but it's not at
all important to see the precise micro-ordering of events on a global
basis. We want to see rough workload behavior, how tasks interact -
and most importantly, we want tracing to be as low-overhead as
possible.
- [ in many cases coherency does not matter because we only look at a
single CPU's or app's trace. ]
for example on a 16-way CPU, when i run a high-event-count workload and
add global serialization to events, the workload can easily be slower by
10% or more. Sometimes even the characteristics of the workload change
due to having a globally synchronized tracer.
So ... neither coherent nor incoherent tracers are a clear, obvious
default.
IMO incoherent is the more useful default in terms of being able to find
bugs with it, because it has a higher utility factor. People have to
interpret traces anyway, and the main use case of ftrace is that i ask a
tester to do a trace, and then i interpret it.
Coherent is the more fool-proof default - but less generally usable.
ftrace was coherent not so long ago - so we can certainly switch back to
that, as a default. But coherency must not be hardcoded into a multi-CPU
tracer.
Since the overhead and serialization skew show up very quickly in
ftrace, we do not serialize globally; instead we use cpu_clock() and try
to make that accurate enough.
But in any case, cpu_clock()/sched_clock() is non-serialized, so it can
be used in coherent and non-coherent tracers just as well. So i don't see
the fundamental connection.
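As an illustration of why the timestamp source is independent of the serialization question, here is a hedged sketch in C of how a trace reader could merge per-CPU event streams by timestamp. This is not the ftrace implementation; stamped_event, cpu_stream and merge_next are hypothetical names, and the ts field merely stands in for a value such as the one cpu_clock() would return. The merge works the same way whether or not the writers were globally serialized; only the trustworthiness of the resulting cross-CPU order differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct stamped_event {
	uint64_t ts;  /* timestamp taken on the recording CPU */
	int      cpu; /* CPU that recorded the event */
	int      id;  /* event identifier */
};

struct cpu_stream {
	const struct stamped_event *ev; /* events in per-CPU recording order */
	size_t len;                     /* number of events in ev */
	size_t pos;                     /* next unread event */
};

/*
 * Return the unread event with the smallest timestamp across all
 * streams and advance that stream, or NULL once every stream is
 * drained. Ties go to the lowest-numbered stream.
 */
static const struct stamped_event *merge_next(struct cpu_stream *s, size_t nr)
{
	struct cpu_stream *best = NULL;

	for (size_t i = 0; i < nr; i++) {
		if (s[i].pos == s[i].len)
			continue; /* this stream is empty */
		if (!best || s[i].ev[s[i].pos].ts < best->ev[best->pos].ts)
			best = &s[i];
	}
	return best ? &best->ev[best->pos++] : NULL;
}
```

With per-CPU clocks that are only "accurate enough", two events from different CPUs whose timestamps are closer than the clock skew can come out of this merge in either order, which is exactly the micro-scale mixing of an incoherent tracer.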
Ingo