Re: [PATCH tip 0/3] Improvements of scheduler related Tracepoints

From: Alexei Starovoitov
Date: Thu Dec 14 2017 - 22:16:55 EST


On 12/14/17 12:49 PM, Peter Zijlstra wrote:
> On Thu, Dec 14, 2017 at 12:20:41PM -0800, Teng Qin wrote:
>> This set of commits attempts to improve three scheduler related
>> Tracepoints: sched_switch, sched_process_fork, sched_process_exit.
>>
>> Firstly, these commits add additional flag values, namely preempt,
>> clone_flags and group_dead, to these Tracepoints, to make information
>> exposed via the Tracepoints more useful and complete.
>>
>> Secondly, these commits expose task_struct pointers in these
>> Tracepoints. The task_struct pointers are arguments of the Tracepoints
>> and currently only used to compute struct field values. But for BPF
>> programs attached to these Tracepoints, we may want to read additional
>> task information via the task_struct pointers. This is currently either
>> impossible, or we have to make assumptions about whether the Tracepoint is
>> running from previous / parent or next / child, and use the current pointer
>> instead. Exposing the task_struct pointers explicitly makes such use
>> cases easier and more reliable.
>
> NAK

Not sure what the concern is here.
Is it the first or the second part of the above?
preempt and group_dead are bool and clone_flags has uapi defined
flags, so no kernel internals are being exposed.
The two task_struct pointers are unusable outside of bpf progs.
There are plenty of other tracepoints that store pointers to
kernel structs, and bpf progs are looking into them.
So nothing new is being exposed here either.

Note that TP_printk() is kept unchanged, so typical user tracing
that parses trace_pipe won't see any difference.
Apps that use binary APIs go through libs like libtracecmd and are also fine.
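
To illustrate the TP_printk() point, here is a small userspace mock. It is
not kernel code: the struct layouts, the EXIT_FMT string and the helper names
are hypothetical stand-ins for a tracepoint entry and its print format. It
only shows that adding fields to the record does not change the text line as
long as the format string and its arguments stay the same.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record as it looks before the series (mock layout). */
struct exit_entry_old {
	char comm[16];
	int pid;
	int prio;
};

/* Same record with the proposed extra fields appended (mock layout). */
struct exit_entry_new {
	char comm[16];
	int pid;
	int prio;
	bool group_dead;	/* new flag */
	const void *task;	/* new pointer, only meaningful to a bpf prog */
};

/* Stand-in for an unchanged TP_printk(): only the original fields. */
#define EXIT_FMT "comm=%s pid=%d prio=%d"

static int format_old(char *buf, size_t len, const struct exit_entry_old *e)
{
	return snprintf(buf, len, EXIT_FMT, e->comm, e->pid, e->prio);
}

static int format_new(char *buf, size_t len, const struct exit_entry_new *e)
{
	/* The new fields are stored in the record but never printed. */
	return snprintf(buf, len, EXIT_FMT, e->comm, e->pid, e->prio);
}

/* Returns 1 if a text consumer would see identical lines, 0 otherwise. */
static int text_output_unchanged(void)
{
	static const int dummy_task = 0;	/* placeholder for a task_struct */
	struct exit_entry_old o = { .comm = "bash", .pid = 1234, .prio = 120 };
	struct exit_entry_new n = {
		.comm = "bash", .pid = 1234, .prio = 120,
		.group_dead = true, .task = &dummy_task,
	};
	char a[64], b[64];

	format_old(a, sizeof(a), &o);
	format_new(b, sizeof(b), &n);
	return strcmp(a, b) == 0;
}
```

Calling text_output_unchanged() from a test main() returns 1 in this mock.
A consumer of the binary record sees a larger entry, which is why it needs
to go through the field descriptions rather than a hardcoded layout.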