Re: [RFD tracing] Tracing ABI Work Plan
From: David Sharp
Date: Fri Nov 12 2010 - 20:56:53 EST
On Thu, Nov 11, 2010 at 5:02 AM, Mathieu Desnoyers
<mathieu.desnoyers@xxxxxxxxxxxx> wrote:
> * Masami Hiramatsu (masami.hiramatsu.pt@xxxxxxxxxxx) wrote:
>> Hi,
>>
>> (2010/11/11 9:46), Mathieu Desnoyers wrote:
>> > A) New ABI for user-space
>> >
>> > This new ABI will provide the features long-awaited by tracing users. We've
>> > already started this discussion by listing the requirements and acknowledging
>> > them. It is now time to start discussing the ABI. Upon disagreement on answering
>> > a specific requirement, two questions will arise:
>> >
>> > 1. How much trouble it really is to take care of this requirement. If the answer
>> >    is "not much", then we simply take care of it.
>> > 2. If it really is a big deal to take care of a requirement at the ABI level,
>> >    then we will have to discuss the use-cases.
>> >
>> > Once we are on the same page with respect to these requirements, we can come up
>> > with an ABI proposal for:
>> >
>> > - Tracing control
>> > - Trace format
>> >
>> >
>> > B) Internal Instrumentation API
>> >
>> > I propose to standardize the instrumentation mechanisms (Tracepoints, Function
>> > Tracer, Dynamic Probes, System Call Tracing, etc), so they can be used by
>> > Ftrace, Perf, and by the new ABI without requiring to build all three tracer
>> > ABI code-bases in the kernel. This involves modularizing the instrumentation
>> > sources, taking the following aspects into account:
>> >
>> > - They should be stand-alone objects, which can be built without a full tracer
>> >   enabled.
>> > - They should offer a "registration/unregistration" API to tracers, so tracers
>> >   can register a callback along with a private data pointer (this would fit
>> >   with the "multiple concurrent tracing session" requirement).
>> > - They should call these callbacks passing the private data pointer when the
>> >   instrumentation is hit.
>> > - They should provide a mechanism to list the available instrumentation (when it
>> >   makes sense) and active instrumentation. E.g., it makes sense for tracepoints
>> >   to list the available tracepoints, but it only makes sense for dynamic probes
>> >   to list the enabled probes.
>> >
>> > Masami Hiramatsu and Frederic Weisbecker already showed interest in undertaking
>> > this task.
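
The registration/unregistration idea in section B can be sketched in a few lines of C. This is only an illustration under stated assumptions, not existing kernel code: every name here (instr_source, instr_register, instr_unregister, instr_hit, count_hit) is hypothetical, and the sketch is single-threaded user-space code, so it ignores the locking or RCU a real in-kernel version would need.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLERS 8

/* Hypothetical callback type: private data first, then the event name. */
typedef void (*probe_fn)(void *priv, const char *event);

struct handler {
	probe_fn fn;
	void *priv;
};

/* Hypothetical instrumentation source with a fixed-size handler table. */
struct instr_source {
	const char *name;
	struct handler handlers[MAX_HANDLERS];
};

/* Register a callback and its private data; returns -1 if the table is full. */
static int instr_register(struct instr_source *src, probe_fn fn, void *priv)
{
	for (size_t i = 0; i < MAX_HANDLERS; i++) {
		if (!src->handlers[i].fn) {
			src->handlers[i].fn = fn;
			src->handlers[i].priv = priv;
			return 0;
		}
	}
	return -1;
}

/* Remove the entry matching both fn and priv; returns -1 if not found. */
static int instr_unregister(struct instr_source *src, probe_fn fn, void *priv)
{
	for (size_t i = 0; i < MAX_HANDLERS; i++) {
		if (src->handlers[i].fn == fn && src->handlers[i].priv == priv) {
			src->handlers[i].fn = NULL;
			src->handlers[i].priv = NULL;
			return 0;
		}
	}
	return -1;
}

/* Called when the instrumentation is hit: invoke every registered callback. */
static void instr_hit(struct instr_source *src, const char *event)
{
	for (size_t i = 0; i < MAX_HANDLERS; i++)
		if (src->handlers[i].fn)
			src->handlers[i].fn(src->handlers[i].priv, event);
}

/* Example per-session state and a callback that counts hits for it. */
struct session {
	unsigned long hits;
};

static void count_hit(void *priv, const char *event)
{
	(void)event;
	((struct session *)priv)->hits++;
}
```

Registering count_hit twice with two different struct session pointers gives two independent hit counters on the same source, which is how the private data pointer would map onto multiple concurrent tracing sessions.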
>>
>> Actually, I didn't talk about what API should be provided internally.
>> (Yeah, I know the LTTng handler wants that. However, there is no "external" handler
>>  _inside_ the Linux kernel tree)
>
> My target here is not LTTng. My goal is to get the ball rolling for the improved
> ABI. If we make sure all instrumentation sources provide a clean API to Ftrace,
> Perf, and eventually the new ABI, then it makes it easier to transition from one
> ABI to another; we would not have to change the "whole world", but rather just
> to switch to the new ABI when it is deemed ready.
>
>> Instead, Frederic and I talked briefly about something like a user interface
>> for events. (so it would be closer to A, about controlling)
>
> Yep, this too makes sense.
>
>> As Thomas said, eventually kernel internal tracer should simply provide
>> "events tracing" functionality. User tools will analyze it and it's not
>> kernel's business. I agree with his opinion.
>
> Right.
>
>> From the above viewpoint, currently only trace-events (tracepoint-based events)
>> and dynamic-events (kprobe-based events) provide the same interface to
>> users. For example, perf's PMU events and ftrace's mcount events don't
>> show up under debugfs/tracing/events. IMHO, all events provided by the kernel
>> should be there, so that user tools can read the format and control those
>> events in the same way.
>
> We should decide if we keep this stuff under /debugfs or move it elsewhere. This
> is part of the ABI after all. Independently of where this ends up, the
> operations we need to perform are:
>
> - For each instrumentation source (tracepoints, function tracing, dynamic
>   probes, PMC, ...)
>   - List available instrumentation
>     - Makes sense for tracepoints and PMC, but function graph tracer and dynamic
>       probes might skip this part.
>   - List activated instrumentation
>   - Control interface
>     - Activate/deactivate instrumentation, on a per trace session basis
>       - Note: the concept of "trace session" currently does not exist in either
>         perf or ftrace. We'll have to think of something in terms of ABI here.
>       - Note2: each instrumentation source will expect its own set of
>         parameters to specify the instrumentation to control.
>       - Note3: Handling of instrumentation in dynamically loadable modules
>         (which applies also to dynamic probes) might require that we allow the
>         control interface to activate a tracepoint or dynamic probe for a trace
>         session (e.g. by name) before the instrumentation point is listed as
>         available instrumentation. The goal is to deal with modules dynamically
>         loaded and dynamic instrumentation dynamically added while the trace is
>         being recorded, without requiring any user knowledge about
>         module-specific parameters whatsoever.
>
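
Note3 above (activating by name before the instrumentation point exists) could look roughly like this. It is a sketch under assumptions, not a proposal for the actual ABI: trace_session, instr_registry and all function names are made up for illustration, there is a single session, and there is no locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAMES 16

/* Hypothetical trace session: remembers every name it was asked to trace. */
struct trace_session {
	const char *requested[MAX_NAMES];
	size_t nr_requested;
};

/* Hypothetical instrumentation point, e.g. a tracepoint or dynamic probe. */
struct instr_point {
	const char *name;
	bool active;
};

/* Hypothetical registry of the points that currently exist. */
struct instr_registry {
	struct instr_point points[MAX_NAMES];
	size_t nr_points;
};

/* Has the session asked for this name, whether or not it exists yet? */
static bool session_wants(const struct trace_session *s, const char *name)
{
	for (size_t i = 0; i < s->nr_requested; i++)
		if (strcmp(s->requested[i], name) == 0)
			return true;
	return false;
}

/*
 * Control interface: activate by name. The request is always recorded, and
 * any point that already exists under that name is switched on right away.
 */
static int session_activate(struct trace_session *s,
			    struct instr_registry *reg, const char *name)
{
	if (s->nr_requested == MAX_NAMES)
		return -1;
	s->requested[s->nr_requested++] = name;
	for (size_t i = 0; i < reg->nr_points; i++)
		if (strcmp(reg->points[i].name, name) == 0)
			reg->points[i].active = true;
	return 0;
}

/*
 * Called when a point appears later (module load, new dynamic probe):
 * it starts out active if the session had already requested its name.
 */
static int registry_add_point(struct instr_registry *reg,
			      const struct trace_session *s, const char *name)
{
	struct instr_point *p;

	if (reg->nr_points == MAX_NAMES)
		return -1;
	p = &reg->points[reg->nr_points++];
	p->name = name;
	p->active = session_wants(s, name);
	return 0;
}

/* List-style query: is the named point present and active? */
static bool point_is_active(const struct instr_registry *reg, const char *name)
{
	for (size_t i = 0; i < reg->nr_points; i++)
		if (strcmp(reg->points[i].name, name) == 0)
			return reg->points[i].active;
	return false;
}
```

The point of the design is that session_activate() never fails just because the name is unknown, so the user does not need to know when, or from which module, the point will show up.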
>> For this purpose, I'd like to expand trace-event/dynamic-event framework to
>> those events. It seems that some PMU events can be treated as trace-events,
>> mcount and other parametric events can be treated as dynamic-events.
>>
>> Anyway, that stuff can be done without the new-ring-buffer-ABI things.
>> I'll just expand dyn-events a bit further from here :-)
>
> Steven wanted to clean up his debugfs event description files, so this would fit
> well with this effort, and is indeed an ABI change. One way to do it is to keep
> the old files around and come up with a new hierarchy for the "cleaned up"
> files, along with the new features you want to add.
>
> Also, we might want to consider moving the debugfs event description files to a
> slightly different format (see my metadata proposal). It expands a bit on the
> current information, and allows us to deal with bitfields much more elegantly.
> However this is also an ABI change.

In this vein, we'd like to see a version number somewhere in the
interface. Mainly, it should cover the ring buffer data headers, the
event description format (not content), and the control file interface
(enable, filter, etc). I don't think the text format that comes out of
the "trace" file necessarily needs to be versioned. A simple
major.minor string would be fine.
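
To make the major.minor idea concrete, here is a minimal user-space sketch of how a tool might parse and check such a string. The names (trace_abi_version, trace_abi_parse_version, trace_abi_compatible) are hypothetical, and the compatibility rule is an assumption rather than anything agreed in this thread: a major bump is taken to mean an incompatible change, a minor bump a backward-compatible addition.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>

/* Hypothetical in-memory form of a "major.minor" version string. */
struct trace_abi_version {
	unsigned int major;
	unsigned int minor;
};

/* Parse a run of decimal digits at *p and advance *p past it. */
static int parse_uint(const char **p, unsigned int *out)
{
	const char *s = *p;
	unsigned long long val = 0;

	if (!isdigit((unsigned char)*s))
		return -1;	/* at least one digit is required */
	while (isdigit((unsigned char)*s)) {
		val = val * 10 + (unsigned long long)(*s - '0');
		if (val > UINT_MAX)
			return -1;	/* does not fit in unsigned int */
		s++;
	}
	*out = (unsigned int)val;
	*p = s;
	return 0;
}

/*
 * Parse "major.minor", optionally followed by one newline (as when the
 * string is read from a file). Returns 0 on success, -1 on malformed input.
 * *v is only written on success.
 */
static int trace_abi_parse_version(const char *s, struct trace_abi_version *v)
{
	struct trace_abi_version tmp;

	if (parse_uint(&s, &tmp.major) || *s++ != '.' ||
	    parse_uint(&s, &tmp.minor))
		return -1;
	if (*s == '\n')
		s++;
	if (*s != '\0')
		return -1;	/* trailing garbage */
	*v = tmp;
	return 0;
}

/*
 * Assumed rule: the kernel-provided version satisfies a tool's requirement
 * when the majors match and the provided minor is at least the required one.
 */
static int trace_abi_compatible(const struct trace_abi_version *provided,
				const struct trace_abi_version *required)
{
	return provided->major == required->major &&
	       provided->minor >= required->minor;
}
```

Under that assumed rule, a tool built against 1.2 would accept a kernel reporting "1.3" and refuse one reporting "2.0". The same check could be applied separately to the ring buffer headers, the event description format and the control files if each carries its own version.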
>
> Thanks,
>
> Mathieu
>
>>
>> Best Regards,
>>
>> --
>> Masami HIRAMATSU
>> 2nd Dept. Linux Technology Center
>> Hitachi, Ltd., Systems Development Laboratory
>> E-mail: masami.hiramatsu.pt@xxxxxxxxxxx
>
> --
> Mathieu Desnoyers
> Operating System Efficiency R&D Consultant
> EfficiOS Inc.
> http://www.efficios.com
>