> We still keep the CPU id because LKET still supports ASCII tracing, which
> mixes the output of all the CPUs together. It is still debatable
> whether this is a useful feature or not though. If we remove ASCII
> event tracing from LKET, we could remove CPU id from the event header as
> well.
>
How hard would it be to make LKET send its ASCII output to multiple "channels"
(buffers) and then fetch and combine them in user space? Have a look at lttd
and lttv in the ltt-control package from the LTTng project: they would be
trivial to adapt. In fact, there is already a text dump module available.
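To make the "combine them in user space" step concrete, here is a rough
sketch of a timestamp-ordered merge of per-CPU channels. It is not taken from
lttd or lttv; the Event fields and the merge_channels name are assumptions
made up for the illustration, and it only assumes that each per-CPU buffer is
already in timestamp order.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    timestamp: int  # assumed unit: nanoseconds since trace start
    cpu: int        # implied by the channel the event was read from
    text: str       # ASCII payload of the event


def merge_channels(channels: Iterable[Iterable[Event]]) -> Iterator[Event]:
    """Merge per-CPU event streams into one stream ordered by timestamp.

    Each input stream must already be sorted by timestamp, which holds
    when every CPU writes to its own buffer in order.
    """
    return heapq.merge(*channels, key=lambda ev: ev.timestamp)


# Two hypothetical per-CPU channels, each already in timestamp order.
cpu0 = [Event(10, 0, "sys_open"), Event(40, 0, "sys_close")]
cpu1 = [Event(20, 1, "sched_switch"), Event(30, 1, "sys_read")]

merged = list(merge_channels([cpu0, cpu1]))
for ev in merged:
    print(f"{ev.timestamp:>6} cpu{ev.cpu} {ev.text}")
```

Since the CPU is implied by the channel an event was read from, a scheme like
this would not need a CPU id in every event header.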
That's also possible, and it should not be difficult to implement.
> The tid we still include because LKET supports turning on individual
> tracepoints unlike LTT, which if I remember correctly turns on all the
> tracepoints that are compiled into the running kernel. Since the user is
> free to choose which tracepoints he wants to use for his workload, we
> cannot guarantee that scheduler tracepoints are going to be available. We
> consider taking the tid as one of those absolute minimum pieces of data
> required to do meaningful analysis.
>
I understand, but it does not have to be included in the bare-bones event
header. We could think of an optional "event context" header that would have its
individual parts enabled or not depending on the events recorded in the trace.
For instance:
With scheduler instrumentation activated:
    Event Header | Variable data
Without scheduler instrumentation activated:
    Event Header | PID | Variable data
Whether the optional event context is present in the trace could be recorded
in the trace header.
This way, we would not add unnecessary data when it is not needed.
Furthermore, the scheme is extensible to other event context information.
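As an illustration of the optional event context, here is a small sketch in
which a flag in the trace header tells the reader whether each event carries
a PID. The byte layout, the CTX_PID flag and the function names are all
hypothetical; this is not the LKET or LTTng format, just one way the idea
could look.

```python
import struct

# Hypothetical layout, for illustration only (not the LKET or LTTng format):
#   trace header : 1 byte of context flags
#   event header : u32 timestamp, u16 event id, u16 payload length
#   event context: u32 PID, present only when CTX_PID is set
CTX_PID = 0x01
EVENT_HEADER = struct.Struct("<IHH")
PID_FIELD = struct.Struct("<I")


def write_trace(events, with_pid):
    """Serialize (timestamp, event_id, pid, payload) tuples."""
    flags = CTX_PID if with_pid else 0
    out = bytearray([flags])
    for timestamp, event_id, pid, payload in events:
        out += EVENT_HEADER.pack(timestamp, event_id, len(payload))
        if with_pid:
            out += PID_FIELD.pack(pid)
        out += payload
    return bytes(out)


def read_trace(data):
    """Parse a trace, consulting the header flags for the optional PID."""
    flags = data[0]
    offset = 1
    events = []
    while offset < len(data):
        timestamp, event_id, length = EVENT_HEADER.unpack_from(data, offset)
        offset += EVENT_HEADER.size
        pid = None  # unknown unless the context field is present
        if flags & CTX_PID:
            (pid,) = PID_FIELD.unpack_from(data, offset)
            offset += PID_FIELD.size
        payload = data[offset:offset + length]
        offset += length
        events.append((timestamp, event_id, pid, payload))
    return events


sample = [(100, 7, 1234, b"open"), (150, 8, 1234, b"read")]
lean = write_trace(sample, with_pid=False)  # scheduler events give the PID
full = write_trace(sample, with_pid=True)   # no scheduler instrumentation
```

In this toy layout the PID costs four bytes per event, and only in traces
that need it. Other context fields could be added as further flag bits.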
> We chose to control performance and trace output size by letting users
> have control of the number of tracepoints they can activate at any given
> time. This is important to us since we plan to add many dynamic
> tracepoints to different sub-systems (filesystem, device drivers, core
> kernel facilities, etc...). Turning on all of these tracepoints at the
> same time would slow down the system too much and change the performance
> characteristics of the environment being studied.
Yes, I know that overhead is a big problem with dynamic instrumentation ;) I
think we can find a way to have an optimal trace format while still giving
a dynamic probe based tracer enough context when needed.
Looks like the approach you propose above could apply to this as well. You
could implement some sort of debug mode for the trace data that provides
extra information useful for debugging the tool. If the information is really
only useful when debugging the trace tool during development, wouldn't it
make sense to have a way to disable the debugging junk as needed?
> I understand. But if the size of each event is fixed, why would you
> expect the data sizes that the tool reports in the trace header for each
> event to change over the course of a trace? If the data on the per-CPU
> buffers is serialized, a similar authentication could be done using the
> timestamp by checking the timestamps of the events before and after the
> current event, thus validating the current timestamp as well as the size
> offset of the previous event. Just a thought.
>
Yes, but if there is a bug with the timestamp (time going backward because of
problematic event record serialization), it becomes harder to pinpoint the
source of the problem (whether it is due to a bug in the variable data
serialization mechanism, a bug in the user space "unserialization" mechanism,
or a bug in event serialization within the kernel). LTTng hasn't suffered from
this kind of issue for quite some time, but when under heavy development,
those indicators of data consistency have all proven their usefulness.
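A small sketch of why independent indicators help: the checker below reports
size mismatches and backward timestamps as separate problems, so a corrupted
record does not have to be diagnosed from the timestamp alone. The Record
fields, the declared_sizes table and the check_buffer name are assumptions
for the example, not LTTng's actual checks.

```python
from typing import Dict, List, NamedTuple


class Record(NamedTuple):
    timestamp: int
    event_id: int
    size: int  # payload size recorded with the event itself


def check_buffer(records: List[Record],
                 declared_sizes: Dict[int, int]) -> List[str]:
    """Check one per-CPU buffer and return a list of problems found.

    declared_sizes maps event id to the size announced in the trace header.
    """
    problems = []
    previous = None
    for index, rec in enumerate(records):
        expected = declared_sizes.get(rec.event_id)
        if expected is None:
            problems.append(f"record {index}: unknown event id {rec.event_id}")
        elif rec.size != expected:
            problems.append(
                f"record {index}: size {rec.size}, expected {expected}")
        if previous is not None and rec.timestamp < previous:
            problems.append(
                f"record {index}: timestamp {rec.timestamp} < {previous}")
        previous = rec.timestamp
    return problems


declared = {1: 8, 2: 16}
good = [Record(10, 1, 8), Record(20, 2, 16)]
bad = [Record(10, 1, 8), Record(5, 2, 12)]

print(check_buffer(good, declared))
print(check_buffer(bad, declared))
```

With the size check in place, a backward timestamp that comes with a size
mismatch points at the serialization of that record, while a backward
timestamp on its own points elsewhere.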