[RFC PATCH v7 0/5] *** Latency histograms ***
From: Binoy Jayan
Date: Tue Sep 20 2016 - 09:52:13 EST
Latency Histograms
=====================
1. Introduction
----------------
This series consists of patches meant for capturing latencies which
occur in a realtime system and to render it as a histogram of meaningful
data. Few of these patches were originally part of the PREEMPT_RT patchet
and the rt linux tree. These are adapted to make use of the kernel event
infrastructure and histogram triggers. This patch series aims at preparing
these patches for mainlining. The individual changelog describe the specific
patchset. To check the feasibility of mainlining this series, please refer
section 5. The original patch set (Latency hist) may be found under section 6.
The patch series also consists of an additonal histogram feature and a bug fix.
v6: https://lkml.org/lkml/2016/9/7/253
2. Changes v6 -> v7:
---------------------
- Patch to capture additional per process data added as part of
wakeup latency patch
tracing: wakeup latency events and histograms
- Removed tracepoint condition '*_enabled' and let tracer core handle it
- Adjusted order of the new struct member 'struct hrtimer' to reduce footprint
- Documented new member 'tim_expiry' in 'struct hrtimer'
- For patch 3, use the timestamp from the tracer itself to avoid repeated
requests to hardware
- Removed unwanted type casts
- Added comments for new tracepoints
3. Questions which were unanswered in v6
-----------------------------------------
- How is this facility useful on it's own?
To capture potential sources of latencies
- What is the value of adding this?
The goal is not to document the deterministic execution time of irq
(the "IRQ latency"), but to provide evidence that both the scheduler
and the context switch guarantee that a waiting user-space real-time
application will start not later than the maximum acceptable latency.
This made it necessary to determine the "preemption latency" sometimes
also referred to as "total latency"
- How is it used and what valuable debug information can be collected?
Information collected are IRQ latency, Preemption latency and total latency
For more information, please refer 'Latency Hist article' in the reference
section mentioned below.
4. Switchtime latency
----------------------
The existing two on-line methods to record system latency (timer and
wakeup latency) do not completely reflect the user-space wakeup time,
since the duration of the context switch is not taken into account. This
patch adds two new histograms - one that records the duration of the
context switch and another one that records the sum of the timer delay,
the wakeup latency and the newly available duration of the context
switch. The latter histogram probably is best suitable to determine the
worst-case total preemption latency of a given system.
NB: This patch (refer 'Switchtime latency' under section 6) is presently
not part of the PREEMPT_RT patchset and also has not been adapted to work
with the kernel event infrastructure. It can be done if the other patches
can be considered for mainlining.
5. Mainlining and alternatives
-------------------------------
As of now, the preemptirqsoff, timer and wakeup latencies are part of
the PREEMPT_RT patch series. But not the switchtime latency which is
maintained separately in an off-tree repository. The preemptirqsoff,
and timer latency histogram alone may not make sense due to the reasons
mentioned in section 3. The implementation of these includes modifying
the 'task_struct' which might not sound feasible to many stakeholders.
Also, the switchtime histogram have not been adapted for mainlining
for the same reason. Refer 'Latency Hist article' for more details.
An alternative to keeping the timestamp in 'task_struct' as suggested
by Daniel Wagner (for example in wakeup histogram) is to create kprobe
events for 'wakeup' and 'wakeup_sched_switch' from userland and use the
'hist' triggers to somehow find the difference between the timestamps in
each of the above events. This could involve using tracing_maps by
creating a global variable to keep the process's context. But seems like
too much overhead during runtime. Moreover, this would involve
considerable work from the user space like many of the tracing
utilities and defeats the purpose of simplicity of the in-kernel histograms.
6. Reference
-------------
Latency hist: https://git.kernel.org/cgit/linux/kernel/git/rt/linux-stable-rt.git/commit/?h=v3.14-rt-rebase&id=56d50cc34943bbba12b8c5942ee1ae3b29f73acb
Latency Hist article: https://www.osadl.org/Single-View.111+M538f5aa49d6.0.html
Switchtime latency: http://www.spinics.net/lists/linux-rt-users/msg14377.html
-Binoy
Binoy Jayan (3):
tracing: Add preemptirqsoff timing events
tracing: Measure delayed hrtimer offset latency
tracing: wakeup latency events and histograms
Daniel Wagner (2):
tracing: Dereference pointers without RCU checks
tracing: Add hist trigger support for generic fields
include/linux/hrtimer.h | 4 +++
include/linux/rculist.h | 36 +++++++++++++++++++++
include/linux/sched.h | 6 ++++
include/linux/tracepoint.h | 4 +--
include/trace/events/latency.h | 62 +++++++++++++++++++++++++++++++++++++
include/trace/events/sched.h | 34 ++++++++++++++++++++
kernel/time/Kconfig | 8 +++++
kernel/time/hrtimer.c | 47 ++++++++++++++++++++++++++++
kernel/trace/Kconfig | 10 ++++++
kernel/trace/trace_events.c | 13 ++++----
kernel/trace/trace_events_filter.c | 4 +--
kernel/trace/trace_events_hist.c | 36 +++++++++++++++++----
kernel/trace/trace_events_trigger.c | 6 ++--
kernel/trace/trace_irqsoff.c | 43 ++++++++++++++++---------
kernel/trace/trace_sched_wakeup.c | 35 +++++++++++++++++++--
15 files changed, 312 insertions(+), 36 deletions(-)
create mode 100644 include/trace/events/latency.h
--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project