Re: [PATCH 00/18] [ANNOUNCE] Dynamically created function based events

From: Steven Rostedt
Date: Sat Feb 03 2018 - 16:08:52 EST


On Sat, 3 Feb 2018 12:52:08 -0800
Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:

> On Sat, Feb 03, 2018 at 02:02:17PM -0500, Steven Rostedt wrote:
> >
> > Those who were asking about having "trace markers" (i.e.
> > Facebook) told us they can cope with kernel changes.
>
> There is some misunderstanding here.
> We never asked for this interface.

But you wanted trace markers? Just to confirm.

> We're perfectly fine with existing kprobe/tracepoint+bpf.

OK, so no new development in this was wanted? So the entire talk about
getting tracepoints into vfs and scheduling wasn't needed?

> There are plenty of things to improve there, but this 'function based events'
> is not something we're interested in.

OK, but when I was showing this interface at DevConf.cz, there appeared
to be a lot of interest in it.

> I don't see how they are any better than kprobes and suffer from the same issues.

One only needs to look at the source code to add these. You don't need to
know the specifics of registers and such. That's a big +. Sure, we
could add this to kprobes as well. But this also doesn't need the
kprobe infrastructure.

> We really dislike text based interfaces since they are good only

Who exactly is "we"?

Peter Zijlstra told me it's basically the only interface he uses. He
doesn't care for tools on top.

> for human access and very awkward to use from tools.

trace-cmd builds its entire connection without issue via these
interfaces. What is awkward about it?

> We also dislike it when the kernel takes on the challenge of parsing a text language.
> It's a user space job.

Not if you are working in the embedded space and only have busybox as
your interface.

>
> > The issue is that people are afraid to add tracepoints into their
> > subsystem because they are afraid that they will become stable and
> > limit their own development.
>
> This is not true. Tracepoints are being added and they're being changed.

vfs doesn't have any tracepoints. And Peter is reluctant to add any
more to the scheduler.

> We have a bunch of tools that use both kprobe and tracepoint hooks
> together with bpf programs to extract information from the kernel.
> They do break from time to time when we upgrade kernels (and we upgrade often),
> but keeping 'if kernel X do this, if kernel Y do that' inside the tool
> is perfectly fine.
> More often the tools have 'if kernel X ...' due to bpf verifier differences.
> It's constantly evolving and older kernels cannot load complex bpf
> programs written for the latest. So tools have to do some ugly workarounds.
>
> > One problem we are having today is that too many trace events are being
> > created, where there are a lot of them that have been used once and
> > never used again. And people don't care about them.
>
> I don't think such an issue exists. Please point to an example of
> a tracepoint that was added and no one cares about it.

I've already cleaned up several tracepoints that have no path to them.
I'd say those are ones people do not care about. I've also removed
several trace events that are not even connected anywhere. These take
up around 5k each of memory. And these are just the trace events that
don't have paths to them. If we have tracepoints that no longer have
paths to them (which I can detect), how many more have paths but people
don't care about?

-- Steve

>
> As far as Mathieu's point about detecting the difference between kernels,
> yes, it's a real problem to solve. The tracepoint changes are
> easy to detect, but changes to function arguments are much harder.
> Hence we're using kprobes on functions that are unlikely to change
> and that works well.
>
> Also please note that, from a tracing tool's point of view, kernel
> tracepoints are no different from USDT tracepoints in user space apps.
> The tools attach to both of them and expect both to be more or less
> stable. Yet kernel tracepoints and USDT in apps _do_ change
> and tools have to deal with changes. It's actually harder with USDT.