Re: [PATCH v2] sched/debug: Add new tracepoint to track cpu_capacity

From: Qais Yousef
Date: Fri Aug 28 2020 - 13:27:04 EST


On 08/28/20 19:10, Dietmar Eggemann wrote:
> On 28/08/2020 12:27, Qais Yousef wrote:
> > On 08/28/20 10:00, vincent.donnefort@xxxxxxx wrote:
> >> From: Vincent Donnefort <vincent.donnefort@xxxxxxx>
> >>
> >> rq->cpu_capacity is a key element in several scheduler parts, such as EAS
> >> task placement and load balancing. Tracking this value enables testing
> >> and/or debugging by a toolkit.
> >>
> >> Signed-off-by: Vincent Donnefort <vincent.donnefort@xxxxxxx>
> >>
> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
> >
> > [...]
> >
> >> +int sched_trace_rq_cpu_capacity(struct rq *rq)
> >> +{
> >> + return rq ?
> >> +#ifdef CONFIG_SMP
> >> + rq->cpu_capacity
> >> +#else
> >> + SCHED_CAPACITY_SCALE
> >> +#endif
> >> + : -1;
> >> +}
> >> +EXPORT_SYMBOL_GPL(sched_trace_rq_cpu_capacity);
> >> +
> >
> > The placement of this #ifdef looks odd to me. But FWIW
> >
> > Reviewed-by: Qais Yousef <qais.yousef@xxxxxxx>
>
> Returning -1 for cpu_capacity? It makes sense for sched_trace_rq_cpu()
> but for cpu_capacity?

If rq is NULL you return -1, which I read as an error value. rq is passed in as
an argument, so it's better to handle NULL explicitly than to blindly
dereference rq and crash.
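
To make that contract concrete, here is a minimal userspace sketch, not the
patch as posted. struct rq is a one-field stand-in, CONFIG_SMP is defined by
hand, and SCHED_CAPACITY_SCALE is defined locally as 1024. It also shows one
possible way to keep the #ifdef out of the middle of the ternary, which is
only a layout suggestion.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stand-ins so the sketch builds outside the kernel. */
#define CONFIG_SMP 1
#define SCHED_CAPACITY_SCALE 1024

struct rq {
	unsigned long cpu_capacity;	/* the only field this sketch needs */
};

/* Return the capacity of @rq, or -1 if the caller passed NULL. */
int sched_trace_rq_cpu_capacity(struct rq *rq)
{
	if (!rq)
		return -1;

#ifdef CONFIG_SMP
	return (int)rq->cpu_capacity;
#else
	return SCHED_CAPACITY_SCALE;
#endif
}

/* Usage: a negative return means "no rq", never a capacity value. */
void example_usage(void)
{
	struct rq rq = { .cpu_capacity = 446 };

	assert(sched_trace_rq_cpu_capacity(&rq) == 446);
	assert(sched_trace_rq_cpu_capacity(NULL) == -1);
}
```

Since a real capacity is never negative, a probe can tell the -1 error apart
from any valid reading.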

>
> Can you remind me why we have all these helper functions like
> sched_trace_rq_cpu_capacity?

struct rq is defined in kernel/sched/sched.h. It's not exported. Exporting
these helper functions was the agreement to help modules trace internal info.
Passing the generic rq pointer decouples the tracepoint from any specific piece
of info and lets modules extract everything they need from the same
tracepoint. I.e. if you need more than just cpu_capacity from this tracepoint,
you can get it without having to keep adding extra arguments every time you
need another piece of info. Unless that info is not in the rq, of course.
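
As a rough illustration of that split, the userspace sketch below keeps
struct rq opaque on the module side and only fills it in further down as a
simplified stand-in for the scheduler side. probe_cpu_capacity() and
struct capacity_sample are hypothetical names made up for this example; only
the two sched_trace_rq_*() accessors come from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* --- What a module sees: an opaque type plus exported accessors. --- */
struct rq;	/* layout stays private to the scheduler */

int sched_trace_rq_cpu(struct rq *rq);
int sched_trace_rq_cpu_capacity(struct rq *rq);

/* Hypothetical record the module fills from a single tracepoint hit. */
struct capacity_sample {
	int cpu;
	int capacity;
};

/*
 * Hypothetical probe: the tracepoint hands over only @rq and the module
 * picks the accessors it wants. Needing one more field later means calling
 * one more accessor, not changing the tracepoint prototype.
 */
void probe_cpu_capacity(void *data, struct rq *rq)
{
	struct capacity_sample *s = data;

	s->cpu = sched_trace_rq_cpu(rq);
	s->capacity = sched_trace_rq_cpu_capacity(rq);
}

/* --- Stand-in for the scheduler side (assumption: simplified rq). --- */
struct rq {
	int cpu;
	unsigned long cpu_capacity;
};

int sched_trace_rq_cpu(struct rq *rq)
{
	return rq ? rq->cpu : -1;
}

int sched_trace_rq_cpu_capacity(struct rq *rq)
{
	return rq ? (int)rq->cpu_capacity : -1;
}

/* Usage: one rq pointer in, two pieces of info out. */
void example_usage(void)
{
	struct rq rq = { .cpu = 2, .cpu_capacity = 871 };
	struct capacity_sample s;

	probe_cpu_capacity(&s, &rq);
	assert(s.cpu == 2 && s.capacity == 871);
}
```

The probe compiles against the forward declaration alone, which is the point:
the module never needs the layout of struct rq.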

>
> In case we would let the extra code (which transforms trace points into
> trace events) know the internals of struct rq we could handle those
> things in the TRACE_EVENT and/or the register_trace_##name(void
> (*probe)(data_proto), void *data) thing.
> We always said when the internal things will change this extra code will
> break. So that's not an issue.

The problem is that you would need to export struct rq in a public header,
which we don't want to do. I have been trying to find out how to use BTF so we
can remove these functions. I haven't gotten far yet, but it should be doable.
It's a question of me finding enough time to understand what has been done so
far and whether I can re-use something or need to come up with extra
infrastructure first.

Thanks

--
Qais Yousef