Re: [PATCH 00/18] [ANNOUNCE] Dynamically created function based events

From: Mathieu Desnoyers
Date: Sat Feb 03 2018 - 12:04:07 EST


----- On Feb 2, 2018, at 6:04 PM, rostedt rostedt@xxxxxxxxxxx wrote:

> At Kernel Summit back in October, we tried to bring up trace markers, which
> would be nops within the kernel proper, that would allow modules to hook
> arbitrary trace events to them. The reaction to this proposal was less than
> favorable. We were told that we were trying to make a work around for a
> problem, and not solving it. The problem in our minds is the notion of a
> "stable trace event".
>
> There are maintainers that do not want trace events, or more trace events in
> their subsystems. This is due to the fact that trace events post an
> interface to user space, and this interface could become required by some
> tool. This may cause the trace event to become stable where it must not
> break the tool, and thus prevent the code from changing.
>
> Or, the trace event may just have to add padding for fields that tools
> may require. The "success" field of the sched_wakeup trace event is one such
> instance. There is no more "success" variable, but tools may fail if it were
> to go away, so a "1" is simply added to the trace event wasting ring buffer
> real estate.
>
> I talked with Linus about this, and he told me that we already have these
> markers in the kernel. They are from the mcount/__fentry__ used by function
> tracing. Have the trace events be created by these, and see if this will
> satisfy most areas that want trace events.

The approach proposed here will introduce an expectation that internal
function signatures in the kernel never change; otherwise, user-space tools
hooking on those functions would break.

The instrumentation infrastructure provided by this patchset might be useful
for "one off" scripts, but it does not address the "stable instrumentation"
expectations issue.

The problem today is caused by widely used trace analysis tools that cannot
cope with changes to the kernel instrumentation and do not report when the
instrumentation fails to match their expectations. On top of that, we generally
don't expect users to ever update those tools to deal with newer kernels. Having
those tools hook on function names/arguments will not make this magically go
away. As soon as kernel code changes, widely used trace analysis tools will
start breaking left and right, and we will be back to square one. Only this time,
it's the internal function signature which will have become an ABI.

A possible solution to this problem appears if we start considering trace
analysis tools as just that: "tooling", with the following properties:

1) Tools need to validate that the instrumentation provided matches their
expectations. This can be done by checking event/field names and/or version.
Tools that fail to do that should be fixed.

2) Tools need to report to the user when the instrumentation does not match
their expectations, and prompt users to upgrade in order to deal with change.

3) Tools need to be backward compatible with respect to instrumentation: a
user switching between older and newer kernels should not have to keep
various copies of their tooling stack (graphical UI, analysis scripts,
and so on).
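
To make properties 1) and 2) concrete, here is a rough sketch of the kind of
check a tool could run before it starts consuming an event. It assumes the
tool reads the per-event "format" description exposed by tracefs. The sample
text, the helper names (parse_fields, check_event, report) and the choice of
required/optional fields are all made up for illustration, and the sample
deliberately omits "success" to mimic a hypothetical kernel that dropped it.

```python
import re

# Hypothetical sample laid out like a tracefs event "format" file.
# The "success" field is intentionally absent (illustration only).
SAMPLE_FORMAT = """\
name: sched_wakeup
ID: 318
format:
\tfield:unsigned short common_type;\toffset:0;\tsize:2;\tsigned:0;
\tfield:char comm[16];\toffset:8;\tsize:16;\tsigned:1;
\tfield:pid_t pid;\toffset:24;\tsize:4;\tsigned:1;
\tfield:int prio;\toffset:28;\tsize:4;\tsigned:1;
\tfield:int target_cpu;\toffset:32;\tsize:4;\tsigned:1;
"""

# Matches the declaration part of each "field:<type> <name>;" line.
FIELD_RE = re.compile(r"^\s*field:(?P<decl>[^;]+);", re.MULTILINE)


def parse_fields(format_text):
    """Return a {field_name: c_type} dict parsed from a format description."""
    fields = {}
    for match in FIELD_RE.finditer(format_text):
        decl = match.group("decl").strip()
        ctype, _, name = decl.rpartition(" ")
        name = name.split("[")[0]  # drop an array suffix such as "[16]"
        fields[name] = ctype
    return fields


def check_event(format_text, required, optional=()):
    """Return (missing_required, missing_optional) field name lists."""
    present = parse_fields(format_text)
    missing_required = [f for f in required if f not in present]
    missing_optional = [f for f in optional if f not in present]
    return missing_required, missing_optional


def report(event, missing_required, missing_optional):
    """Build a user-facing message instead of failing silently."""
    if missing_required:
        return (f"{event}: required field(s) {missing_required} not provided "
                "by this kernel; please upgrade this tool")
    if missing_optional:
        return (f"{event}: optional field(s) {missing_optional} absent; "
                "continuing without them")
    return f"{event}: instrumentation matches expectations"


if __name__ == "__main__":
    missing, absent = check_event(SAMPLE_FORMAT,
                                  required=["comm", "pid", "target_cpu"],
                                  optional=["success"])
    print(report("sched_wakeup", missing, absent))
```

A real tool would read the description from the running kernel rather than
from an embedded string, and would keep one such expectation table per event
it consumes. Treating fields as optional wherever the analysis can cope
without them is one way to approach property 3).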

If our goal is really to address this "stable instrumentation" issue, I don't
think hooking on functions helps in any way. I hope we can work on defining
instrumentation interface rules in order to deal with the fundamental problem
of requiring tooling to adapt to kernel changes.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com