Re: [PATCH] tracing/function-graph-tracer: tracing only syscallsmode

From: Ingo Molnar
Date: Fri Jan 02 2009 - 14:40:44 EST



* Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:

> Impact: make more easy the syscalls tracing
>
> This patch extends to syscalls the features which let one to trace
> only one ore more function (and those they call).
> This way we can get rid of the workqueues, kernel threads, softirqs from
> ksoftirqd and only have the syscall path on the trace.
>
> Note that hardirq that interrupt the syscalls are still traced. But we could
> add an option to disable the interrupt tracing as well in the future.
>
> To use this new feature:
>
> echo function_graph > /debugfs/tracing/current_tracer
> echo syscalls > /debugfs/tracing/set_graph_function

nice. I think a key item is missing though:

> @@ -1397,6 +1398,9 @@ asmregparm long syscall_trace_enter(struct pt_regs *regs)
> {
> long ret = 0;
>
> + /* Tell the function graph tracer we are entering a system call */
> + ftrace_graph_syscall_enter();

We could actually trace the _syscalls_ themselves, a'la strace:

munmap(0xb7ff3000, 4096) = 0

So instead of the current ftrace output:

0) cc1-30212 | | sys32_mmap2() {
0) cc1-30212 | | down_write() {
0) cc1-30212 | 0.272 us | _spin_lock_irq();
0) cc1-30212 | 0.969 us | }
0) cc1-30212 | | do_mmap_pgoff() {

We could get something like:

0) cc1-30212 | | sys32_mmap2(0xb7ff3000, 4096) {
0) cc1-30212 | | down_write() {
0) cc1-30212 | 0.272 us | _spin_lock_irq();
0) cc1-30212 | 0.969 us | }
0) cc1-30212 | | do_mmap_pgoff() {
[...]
0) cc1-30212 | + 22.537 us | } = 0


Or maybe as separate trace entries:

0) cc1-30212 | |> sys32_mmap2 [0xb7ff3000, 4096]
0) cc1-30212 | | sys32_mmap2 {
0) cc1-30212 | | down_write() {
0) cc1-30212 | 0.272 us | _spin_lock_irq();
0) cc1-30212 | 0.969 us | }
0) cc1-30212 | | do_mmap_pgoff() {
[...]
0) cc1-30212 | + 22.537 us | }
0) cc1-30212 | |< sys32_mmap2 => 0

For that we'd have to pass in something like the syscall function address
(sys_call_table[regs->ax]), and the up to 6 parameters
[regs->bx,cx,dx,si,di,bp].

(and initially we could just print all of them i guess, instead of a
variable-width thing)

We wouldnt do smart decoding of syscall arguments normally - just print
the raw arguments with no decoding. (like strace -e raw=all)

How does this sound?

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/