Re: [PATCH v2] sched/debug: Add new tracepoint to track cpu_capacity

From: Alexei Starovoitov
Date: Tue Jan 05 2021 - 11:44:56 EST


On Tue, Jan 5, 2021 at 3:39 AM Qais Yousef <qais.yousef@xxxxxxx> wrote:
>
> On 01/04/21 10:59, Alexei Starovoitov wrote:
> > > > > I did have a patch that allowed that. It might be worth trying to upstream it.
> > > > > It just required a new macro which could be problematic.
> > > > >
> > > > > https://github.com/qais-yousef/linux/commit/fb9fea29edb8af327e6b2bf3bc41469a8e66df8b
> > > > >
> > > > > With the above I could attach using bpf::RAW_TRACEPOINT mechanism.
> > > > >
> > > >
> > > > Yeah, that could work. I meant there was no way to do it with what was there :)
> > > >
> > > > In our initial attempts at using BPF to get at nr_running (which I was not
> > > > involved in and don't have all the details...) there were issues being able to
> > > > keep up and losing events. That may have been an implementation issue, but
> > > > using the module and trace-cmd doesn't have that problem. Hopefully you don't
> > > > see that using RAW_TRACEPOINTs.
> > >
> > > So I have a proper patch for that now, that actually turned out to be really
> > > tiny once you untangle exactly what is missing.
> > >
> > > Peter, bpf programs aren't considered ABIs AFAIK, do you have concerns about
> > > that?
> > >
> > > Thanks
> > >
> > > --
> > > Qais Yousef
> > >
> > > -->8--
> > >
> > > From cf81de8c7db03d62730939aa902579339e2fc859 Mon Sep 17 00:00:00 2001
> > > From: Qais Yousef <qais.yousef@xxxxxxx>
> > > Date: Wed, 30 Dec 2020 22:20:34 +0000
> > > Subject: [PATCH] trace: bpf: Allow bpf to attach to bare tracepoints
> > >
> > > Some subsystems only have bare tracepoints (a tracepoint with no
> > > associated trace event) to avoid the problem of trace events being an
> > > ABI that can't be changed.
> > >
> > > > From the bpf perspective, bare tracepoints are what it calls
> > > RAW_TRACEPOINT().
> > >
> > > Since bpf assumed there's 1:1 mapping, it relied on hooking to
> > > DEFINE_EVENT() macro to create bpf mapping of the tracepoints. Since
> > > bare tracepoints use DECLARE_TRACE() to create the tracepoint, bpf had
> > > no knowledge about their existence.
> > >
> > > By teaching bpf_probe.h to parse DECLARE_TRACE() in a similar fashion to
> > > DEFINE_EVENT(), bpf can find and attach to the new raw tracepoints.
> > >
> > > Enabling that comes with the contract that changes to raw tracepoints
> > > don't constitute a regression if they break existing bpf programs.
> > > We need the ability to continue to morph and modify these raw
> > > tracepoints without worrying about any ABI.
> > >
> > > Signed-off-by: Qais Yousef <qais.yousef@xxxxxxx>
> > > ---
> > > include/trace/bpf_probe.h | 12 ++++++++++--
> > > 1 file changed, 10 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/include/trace/bpf_probe.h b/include/trace/bpf_probe.h
> > > index cd74bffed5c6..a23be89119aa 100644
> > > --- a/include/trace/bpf_probe.h
> > > +++ b/include/trace/bpf_probe.h
> > > @@ -55,8 +55,7 @@
> > > /* tracepoints with more than 12 arguments will hit build error */
> > > #define CAST_TO_U64(...) CONCATENATE(__CAST, COUNT_ARGS(__VA_ARGS__))(__VA_ARGS__)
> > >
> > > -#undef DECLARE_EVENT_CLASS
> > > -#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
> > > +#define __BPF_DECLARE_TRACE(call, proto, args) \
> > > static notrace void \
> > > __bpf_trace_##call(void *__data, proto) \
> > > { \
> > > @@ -64,6 +63,10 @@ __bpf_trace_##call(void *__data, proto) \
> > > CONCATENATE(bpf_trace_run, COUNT_ARGS(args))(prog, CAST_TO_U64(args)); \
> > > }
> > >
> > > +#undef DECLARE_EVENT_CLASS
> > > +#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
> > > + __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args))
> > > +
> > > /*
> > > * This part is compiled out, it is only here as a build time check
> > > * to make sure that if the tracepoint handling changes, the
> > > @@ -111,6 +114,11 @@ __DEFINE_EVENT(template, call, PARAMS(proto), PARAMS(args), size)
> > > #define DEFINE_EVENT_PRINT(template, name, proto, args, print) \
> > > DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
> > >
> > > +#undef DECLARE_TRACE
> > > +#define DECLARE_TRACE(call, proto, args) \
> > > + __BPF_DECLARE_TRACE(call, PARAMS(proto), PARAMS(args)) \
> > > + __DEFINE_EVENT(call, call, PARAMS(proto), PARAMS(args), 0)
> > > +
> > > #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
> >
> > The patch looks fine to me.
> > Please add a few things:
> > - selftests to make sure it gets routinely tested with bpf CI.
>
> Any pointer to an example test I could base this on?

selftests/bpf/

> > - add a doc with contents from commit log.
>
> You're referring to the ABI part of the changelog, right?
>
> > The "Does bpf make things into an abi ?" question keeps coming back
> > over and over again.
> > Every time we have the same answer: no, bpf cannot bake things into ABI.
> > I think once it's spelled out somewhere in Documentation/ it would be easier to
> > repeat this message.
>
> How about a new Documentation/bpf/ABI.rst? I can write something up initially
> for us to discuss in detail when I post.

There is Documentation/bpf/bpf_design_QA.rst
and we already have this text in there that was added back in 2017:

Q: Does BPF have a stable ABI?
------------------------------
A: YES. BPF instructions, arguments to BPF programs, set of helper
functions and their arguments, recognized return codes are all part
of ABI. However there is one specific exception to tracing programs
which are using helpers like bpf_probe_read() to walk kernel internal
data structures and compile with kernel internal headers. Both of these
kernel internals are subject to change and can break with newer kernels
such that the program needs to be adapted accordingly.

I'm suggesting adding a section to this Q/A doc that includes more or
less the same text you had in the commit log.
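
For readers following the macro change in the quoted patch, here is a
minimal userspace sketch of the same idea: one shared generator macro
emits a wrapper that casts the tracepoint arguments to u64 and hands
them to a generic runner, and DECLARE_TRACE() is defined in terms of
that generator. This is not the kernel code. Every demo_* and DEMO_*
name is hypothetical, demo_trace_run2() only stands in for
bpf_trace_run2(), and the two-argument form is an assumption made to
avoid reproducing COUNT_ARGS()/CAST_TO_U64().

```c
#include <stdint.h>
#include <stdio.h>

typedef uint64_t u64;

/* Keeps a comma-separated list together as a single macro argument. */
#define PARAMS(...) __VA_ARGS__

/* Hypothetical stand-in for bpf_trace_run2(): records and prints two u64s. */
static u64 demo_last_args[2];
static int demo_run_count;

static void demo_trace_run2(const char *name, u64 a0, u64 a1)
{
	demo_last_args[0] = a0;
	demo_last_args[1] = a1;
	demo_run_count++;
	printf("%s: %llu %llu\n", name,
	       (unsigned long long)a0, (unsigned long long)a1);
}

/* Shared generator: emits one wrapper that casts both arguments to u64. */
#define DEMO_BPF_DECLARE_TRACE(call, proto, a0, a1)		\
static void demo_bpf_trace_##call(proto)			\
{								\
	demo_trace_run2(#call, (u64)(a0), (u64)(a1));		\
}

/* A bare tracepoint declaration reuses the shared generator. */
#define DECLARE_TRACE(call, proto, a0, a1)			\
	DEMO_BPF_DECLARE_TRACE(call, PARAMS(proto), a0, a1)

/* Hypothetical bare tracepoint; expands to demo_bpf_trace_demo_cpu_capacity(). */
DECLARE_TRACE(demo_cpu_capacity,
	      PARAMS(int cpu, unsigned long capacity), cpu, capacity)
```

Calling demo_bpf_trace_demo_cpu_capacity(3, 1024) from a main() prints
"demo_cpu_capacity: 3 1024". The wrapper's prototype comes straight
from the declaration, so changing the tracepoint's arguments changes
the wrapper with it, which is why programs attached this way cannot
treat the signature as stable.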