Re: [PATCH 3/3] perf events: add timehist option to record and report

From: David Ahern
Date: Fri Feb 18 2011 - 14:53:55 EST




On 02/18/11 12:24, Frederic Weisbecker wrote:
>> We want not only context-switch events, but the stack trace at the
>> switch. For example, with the stack trace you can see preemption -- when
>> a process got booted by another and where the booted process is at the
>> time. You can see not only which system call caused an ap to block
>> (e.g., an ioctl) but the full code path leading to the block.
>
> You can recognize preemption a the context switch tracepoint: if the state
> of the scheduled out task is R (TASK_RUNNING), which means it doesn't go
> to sleep but gets preempted, with an explicit preemption point like cond_resched(),
> or a more implicit one: spin_unlock(), preempt_enable(), an interrupt, ...
> Or it has been woken up while it was about to sleep, but it doesn't make much
> difference.
>
> If you want to know when a process is booted by another you can use the
> fork tracepoint, or sched:wake_up_new, etc...
>
> And you can use syscall tracepoints to get the syscalls you want.
>
> I don't see much the point for you to use stacktraces. But if you
> do, then rather add this support to perf script, in our scripting
> framework.

It's more the simplicity of what we are using today. 1 command, 1 event
being monitored:

perf record -ag -e cs -c 1

A wealth of information. That command shows preemption, stack traces
only for context-switches (not all of the syscalls which is
overwhelming) and opens the door for other analysis. One data set.
Simple. Focused.

>
> Because what you've done is basically to add tracing support to
> perf report. But we have perf script for that already. It only focuses
> on tracepoint events but they are those people are interested in
> because they show logical events in the kernel. I guess
> people are not interested in cpu-cycles overflows events or so as
> they don't show a state change in the kernel.

I have always referred to this as pretty printing each sample recorded
as opposed to summarizing into a histogram. With that approach you have
dictated the analysis of the data - a histogram summary. By printing
each sample with address-symbol conversions we can look at it in
whatever angle we need to make sense of it.

David


>
> Well, yeah I can understand if one considers the software events,
> that makes meaningful events from the kernel. But these software events
> support have been a mistake in perf. You should rather use the
> tracepoint events instead.
>
>> That data along with the gettimeofday timestamp has allowed us to
>> resolve performance issues such as a system call taking longer than
>> expected during a specific sequence of events or a process getting
>> preempted and not scheduled for N seconds. etc., etc.
>
> That's about the same here. If you really need this, you need to add
> the support in perf script to handle that on tracepoint events.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-perf-users" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/