a lttng success story

From: Johnicholas Hines
Date: Sun Jan 07 2007 - 13:25:56 EST


Hi.
I was really pleased to find an intermittent timing problem using
lttng, and maybe an example of how it can be used would be helpful to
people considering adopting it.

The system is a veterinary testing machine running Linux on an ARM
board (specifically the applied data systems sphere).

The symptom was that, under a certain workload, intermittently, a
timing constraint was being violated. Setting the processes to debug
mode (spitting out lots of messages) was too slow - many timing
constraints were violated, and the workload was unrunnable.

I ran the bug-revealing workload under lttng for 4 hours, and captured
one abnormal event (and thousands of normal events). I waded through
the lttng logs, and found that the reason for the pause was that the
target of a message had, in the abnormal case, not blocked on that fd.

Lttng reveals things about the state of the scheduler that are very
helpful in debugging multi-process, userspace, systems. It is faster
and reveals more information than anything special-purpose I could
have built. To repeat myself, maybe an example of how it can be used
would be helpful to people considering adopting it.


Thanks for your attention,

Johnicholas
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/