Re: [RFC][PATCH 00/21] tracing: Inter-event (e.g. latency) support

From: Tom Zanussi
Date: Fri Feb 10 2017 - 13:58:36 EST


Hi Masami,

On Fri, 2017-02-10 at 18:34 +0900, Masami Hiramatsu wrote:
> On Fri, 10 Feb 2017 13:16:17 +0900
> Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
>
> > > ====
> > > Example - wakeup latency
> > > ====
> > >
> > > This basically implements the -RT latency_hist 'wakeup_latency'
> > > histogram using the synthetic events, variables, and actions
> > > described. The output below is from a run of cyclictest using the
> > > following command:
> > >
> > > # rt-tests/cyclictest -p 80 -n -s -t 2
> > >
> > > What we're measuring the latency of is the time between when a
> > > thread (of cyclictest) is awakened and when it's scheduled in. To
> > > do that we add triggers to sched_wakeup and sched_switch with the
> > > appropriate variables, and on a matching sched_switch event,
> > > generate a synthetic 'wakeup_latency' event. Since it's just
> > > another trace event like any other, we can also define a histogram
> > > on that event, the output of which is what we see displayed when
> > > reading the wakeup_latency 'hist' file.
> > >
> > > First, we create a synthetic event called wakeup_latency, that
> > > references 3 variables from other events:
> > >
> > > # echo 'wakeup_latency lat=sched_switch:wakeup_lat \
> > > pid=sched_switch:woken_pid \
> > > prio=sched_switch:woken_prio' >> \
> > > /sys/kernel/debug/tracing/synthetic_events
> > >
> > > Next we add a trigger to sched_wakeup, which saves the value of the
> > > 'common_timestamp' when that event is hit in a variable, ts0. Note
> > > that this happens only when 'comm==cyclictest'.
> > >
> > > Also, 'common_timestamp' is a new field defined on every event (if
> > > needed - if there are no users of timestamps in a trace, timestamps
> > > won't be saved and there's no additional overhead from that).
> > >
> > > # echo 'hist:keys=pid:ts0=common_timestamp.usecs if \
> > > comm=="cyclictest"' >> \
> > > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger
> > >
> > > Next, we add a trigger to sched_switch. When the pid being switched
> > > to matches the pid woken up by a previous sched_wakeup event, this
> > > event grabs the ts0 saved on that event, takes the difference
> > > between it and the current sched_switch's common_timestamp, and
> > > assigns it to a new 'wakeup_lat' variable. It also saves a couple
> > > other variables and then invokes the onmatch().trace() action which
> > > generates a new wakeup_latency event using those variables.
> > >
> > > # echo 'hist:keys=woken_pid=next_pid:woken_prio=next_prio:\
> > > wakeup_lat=common_timestamp.usecs-ts0:onmatch().trace(wakeup_latency) \
> > > if next_comm=="cyclictest"' >> \
> > > /sys/kernel/debug/tracing/events/sched/sched_switch/trigger
> >
> > As Masami said, I think the syntax is a bit hard to understand. Also
> > it'd be nice to access an event field directly (i.e. not by adding a
> > field in a hist). Maybe we can use a prefix like '$' to identify hist
> > fields..
>
> Ah that's a nice idea!
>
> >
> > How about below?
> >
> > # echo 'wakeup_latency \
> > lat=sched_switch.$wakeup_lat \
> > pid=sched_switch.next_pid \
> > prio=sched_switch.next_prio' >> \
> > /sys/kernel/debug/tracing/synthetic_events
>
> Should we define these parameter assignment at this.point?
>
> I think this syntax binds wakeup_latency event to sched_switch too tight. I
> mean, if someone kicks this event from some other event, it may easily lose
> values.
> So, at this point, we will define event name and what parameters it has,
> until binding this event to onmatch().
>

Right, I agree this binding doesn't need to be done here, good idea to
defer it as below...

> > # echo 'hist: \
> > keys=pid: \
> > ts0=common_timestamp.usec \
> > if comm=="cyclictest"' >> \
> > /sys/kernel/debug/tracing/events/sched/sched_wakeup/trigger
> >
> > # echo 'hist: \
> > keys=next_pid: \
> > wakeup_lat=common_timestamp.usec-$ts0: \
> > onmatch(sched_wakeup).trace(wakeup_latency) \
>
> This one seems much better for me, but I would like to ask you call event
> directly from onmatch, like as
>
> "onmatch(sched_wakeup).wakeup_latency(wakeup_lat,next_pid,next_prio)"
>
> At this point, kernel will finalize the wakeup_latency event with wakeup_lat,
> next_pid and next_prio.
>

Yes, I like this much better - things are no longer so implicit and
therefore subject to confusion, and the syntax itself makes more sense,
even if it is a bit more verbose on the trigger, which is fine.

Thanks for taking the time to think about this and for suggesting these
great ideas..

Tom