Ph.D. dissertation "Low-Impact Operating System Tracing"

From: Mathieu Desnoyers
Date: Wed Jan 20 2010 - 17:27:38 EST


I defended my Ph.D. dissertation entitled "Low-Impact Operating System
Tracing" on December 4th, 2009. The dissertation, written in English, is
available online:

Desnoyers, Mathieu (2009), "Low-Impact Operating System Tracing". Ph.D.
dissertation, École Polytechnique de Montréal, [Online]. Available:
http://www.lttng.org/pub/thesis/desnoyers-dissertation-2009-12-v25.pdf

The resulting open source projects are:

- Linux Trace Toolkit Next Generation (LTTng), an LGPLv2.1/GPLv2 tracer
for the Linux kernel (http://www.lttng.org)
- Userspace RCU library (liburcu), a highly-scalable user-space
synchronization library, distributed under the LGPLv2.1 license
(http://www.lttng.org/urcu)


Research Summary

Computer systems are becoming increasingly complex at both the hardware
and software levels. In the case of Linux, which is used in a wide range
of applications, from small embedded devices to high-end servers, the
operating system kernel keeps growing, libraries are added, and major
software redesign is required to benefit from the now-ubiquitous
multi-core architectures. As a result, the software development industry
and individual developers face problems whose resolution requires
understanding the interaction between applications and all components of
an operating system.

In this thesis, we propose the LTTng (Linux Trace Toolkit next
generation) tracer as an answer to the tracing needs of the industry and
the open source community. The low intrusiveness of the tracer is key to
its usefulness, because we need to be able to reproduce, under tracing,
problems that occur in normal conditions. In some cases, users leave
tracers active at all times in production, which makes tracer overhead
critical. Our approach involves the design of synchronization primitives
that meet these low-impact requirements. The linearly scalable and
wait-free RCU (Read-Copy Update) synchronization mechanism used by the
LTTng tracer fulfills these requirements for data reads. A custom-made
buffer synchronization scheme is proposed to extract tracing data while
preserving linear scalability and wait-free characteristics.
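
As a rough illustration of why an RCU-style read side can be wait-free,
here is a minimal C11 sketch. It is not code from LTTng or liburcu: the
names trace_config, current_config, event_is_enabled and publish_config
are hypothetical, and the sketch deliberately leaves out grace-period
detection, which is the part a real RCU implementation has to provide.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical record consulted on every traced event. */
struct trace_config {
    int enabled;
    int event_mask;
};

/* Current version of the record; starts out NULL (nothing published). */
static _Atomic(struct trace_config *) current_config;

/* Read side: one acquire load, then plain reads. No lock, loop, or retry. */
int event_is_enabled(int event_bit)
{
    struct trace_config *cfg =
        atomic_load_explicit(&current_config, memory_order_acquire);

    return cfg != NULL && cfg->enabled && (cfg->event_mask & event_bit) != 0;
}

/*
 * Update side: build a new version, then swap it in. Returns 0 on
 * success, -1 if allocation fails. On success, *old_out receives the
 * previous version (NULL the first time). The caller may free it only
 * once no reader can still be using it; detecting that point (the grace
 * period) is NOT implemented in this sketch.
 */
int publish_config(int enabled, int event_mask, struct trace_config **old_out)
{
    struct trace_config *new_cfg;

    assert(old_out != NULL);
    new_cfg = malloc(sizeof(*new_cfg));
    if (new_cfg == NULL)
        return -1;
    new_cfg->enabled = enabled;
    new_cfg->event_mask = event_mask;
    /* Release ordering makes the fields visible before the pointer. */
    *old_out = atomic_exchange_explicit(&current_config, new_cfg,
                                        memory_order_acq_rel);
    return 0;
}
```

The reader never writes to shared memory and never waits for the
updater, which is where the scalability of the read path comes from.
With liburcu, the read would instead sit between rcu_read_lock() and
rcu_read_unlock() and use rcu_dereference(), and the updater would wait
with synchronize_rcu() before freeing the old version.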

By measuring the impact of LTTng, we demonstrate that it is possible to
create a tracer that satisfies all of the following characteristics: low
latency, deterministic real-time impact (wait-free), small impact on
operating system throughput, and linear scalability with the number of
cores. Experiments on various architectures show that this tracer is
portable.

We propose a general model for superscalar multi-core systems with
weakly-ordered memory accesses to perform formal verification of RCU
correctness and wait-free guarantees by model-checking. The LTTng
buffering scheme is also formally verified for safety and progress.
Formal verification demonstrates that these algorithms allow reentrancy
from multiple execution contexts, ranging from standard threads to
non-maskable interrupt handlers, allowing wide instrumentation coverage
of the operating system.
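
To make the reentrancy requirement concrete, the following C11 sketch
reserves space in a trace buffer with a single atomic fetch-and-add.
This is a simplified stand-in chosen for illustration, not the LTTng
buffering scheme, which is described in the dissertation. The names
trace_buf, trace_reserve and BUF_SIZE are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define BUF_SIZE 64u /* hypothetical, deliberately tiny buffer */

struct trace_buf {
    atomic_size_t reserved;      /* bytes requested so far */
    atomic_size_t lost;          /* events dropped because the buffer was full */
    unsigned char data[BUF_SIZE];
};

/*
 * Reserve len bytes and return a pointer to them, or NULL if they do not
 * fit. The reservation is one atomic fetch-and-add, so the function never
 * loops or blocks. If an interrupt handler nests on top of a thread that
 * has reserved but not yet written its event, the handler simply obtains
 * the next, disjoint range.
 *
 * Simplifications: the buffer does not wrap, and once a request fails the
 * counter has moved past the end, so later requests fail as well.
 */
unsigned char *trace_reserve(struct trace_buf *buf, size_t len)
{
    size_t start;

    assert(buf != NULL);
    start = atomic_fetch_add_explicit(&buf->reserved, len,
                                      memory_order_relaxed);
    if (start > BUF_SIZE || len > BUF_SIZE - start) {
        atomic_fetch_add_explicit(&buf->lost, 1, memory_order_relaxed);
        return NULL;
    }
    return &buf->data[start];
}
```

A caller would copy its event into the returned range; a full design
also needs a commit step so that a consumer knows when the range has
been completely written, which this sketch leaves out.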


Acknowledgements

Thanks to Google, IBM Research, Ericsson, Autodesk, the Natural Sciences
and Engineering Research Council of Canada, and Defence Research and
Development Canada for funding this research.


Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68