Re: [PATCH] perf record: Add snapshot mode support for perf's regular events

From: David Ahern
Date: Wed Nov 25 2015 - 00:06:27 EST


On 11/24/15 8:50 PM, Wangnan (F) wrote:
Actually, we are discussing this problem.

For such tracking events (PERF_RECORD_FORK...), we have the dummy event, so
it is possible for us to receive tracking events from a separate channel;
therefore we don't have to parse every event to pick those events out.
Instead, we can process tracking events differently, and then more
interesting things can be done. For example, we can squash those tracking
events if they take too much memory...
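
[For reference, a minimal sketch of what such a dedicated tracking channel
looks like at the perf_event_open() level: a software dummy event with the
task/mmap/comm bits set. The function name and the omitted mmap/error
handling are illustrative only, not code from the patch.]

#include <linux/perf_event.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open a PERF_COUNT_SW_DUMMY event that generates no samples of its own
 * but still delivers the side-band tracking records (PERF_RECORD_FORK,
 * PERF_RECORD_MMAP, PERF_RECORD_COMM, PERF_RECORD_EXIT) into its own
 * ring buffer, so they can be handled on a separate channel.
 */
static int open_tracking_event(pid_t pid, int cpu)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = PERF_TYPE_SOFTWARE;
	attr.config = PERF_COUNT_SW_DUMMY;
	attr.task = 1;			/* fork/exit records */
	attr.mmap = 1;			/* mmap records */
	attr.comm = 1;			/* comm (thread name) records */
	attr.sample_id_all = 1;

	return syscall(__NR_perf_event_open, &attr, pid, cpu, -1, 0);
}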

If you look at my daemon code, I process task events (FORK, MMAP, EXIT) to maintain task state, including flushing threads when they terminate. This is a trade-off between retaining the knowledge needed to pretty-print addresses (address-to-symbol resolution) and not growing without bound, be it a file or memory.
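
Roughly, that bookkeeping looks like the sketch below (the structure and
function names are mine for illustration, not the daemon's actual code):
a table keyed by tid is populated on FORK/COMM, consulted when resolving
addresses, and flushed and freed on EXIT so memory stays bounded.

#include <linux/perf_event.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/*
 * Illustrative task bookkeeping only -- the real daemon keeps more state
 * (maps for symbol resolution, per-task sample data, ...).
 */
struct task {
	pid_t		tid;
	char		comm[16];
	struct task	*next;
};

static struct task *tasks;

static struct task *task_find(pid_t tid)
{
	struct task *t;

	for (t = tasks; t; t = t->next)
		if (t->tid == tid)
			return t;
	return NULL;
}

/* Called for each side-band record read from the tracking channel. */
static void handle_tracking(unsigned int type, pid_t tid, const char *comm)
{
	struct task *t, **pp;

	switch (type) {
	case PERF_RECORD_FORK:		/* new thread: create state */
		t = calloc(1, sizeof(*t));
		if (!t)
			return;
		t->tid = tid;
		t->next = tasks;
		tasks = t;
		break;
	case PERF_RECORD_COMM:		/* remember the thread name */
		t = task_find(tid);
		if (t && comm)
			strncpy(t->comm, comm, sizeof(t->comm) - 1);
		break;
	case PERF_RECORD_EXIT:		/* flush, then drop the thread */
		for (pp = &tasks; *pp; pp = &(*pp)->next) {
			if ((*pp)->tid == tid) {
				t = *pp;
				*pp = t->next;
				/* ...write t's retained data out here... */
				free(t);
				break;
			}
		}
		break;
	}
}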


Furthermore, there's another problem being discussed: if the userspace
ring buffer is byte based, parsing events is unavoidable; without parsing
events we are unable to find the new 'head' pointer when overwriting.
Instead, we are thinking about a bucket-based ring buffer: perf maintains
a series of buckets, and each time 'poll' returns, perf copies the new
events to the start of a bucket. If all buckets are occupied, we drop the
oldest bucket. A bucket-based ring buffer wastes some memory but avoids
event parsing.
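
[A minimal sketch of the bucket idea, with made-up sizes and names:
buckets are filled round-robin, one per poll() wakeup, and overwriting
discards whole buckets, so the consumer never has to parse events to
find a valid 'head'.]

#include <stddef.h>
#include <string.h>

#define NR_BUCKETS	64
#define BUCKET_SIZE	(512 * 1024)

struct bucket {
	size_t	used;			/* bytes of valid event data */
	char	data[BUCKET_SIZE];
};

struct bucket_ring {
	struct bucket	buckets[NR_BUCKETS];
	unsigned int	head;		/* next bucket to fill */
	unsigned int	nr_full;	/* how many buckets hold data */
};

/*
 * Copy the events that arrived since the last poll() wakeup into the
 * start of the next bucket.  If every bucket is in use, the oldest one
 * is silently overwritten -- whole buckets are dropped, so we never
 * need to parse events to find a new 'head' pointer.
 */
static void ring_store(struct bucket_ring *ring, const void *events, size_t len)
{
	struct bucket *b = &ring->buckets[ring->head];

	if (len > BUCKET_SIZE)
		len = BUCKET_SIZE;	/* sketch only: real code would split */

	memcpy(b->data, events, len);
	b->used = len;

	ring->head = (ring->head + 1) % NR_BUCKETS;
	if (ring->nr_full < NR_BUCKETS)
		ring->nr_full++;
}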

And there are many other problems in this patch. For example, when SIGUSR2
is received, we need to do something to make all perf events start dumping.
The current implementation can't ensure we receive the events produced just
before the SIGUSR2 unless we set 'no-buffer'.
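
[Sketch of the expected flow -- an assumption about what "do something"
would mean, not the patch's implementation: the handler only records that
a dump was requested, and the record loop performs the flush.]

#include <signal.h>
#include <string.h>

static volatile sig_atomic_t dump_requested;

static void sigusr2_handler(int sig)
{
	(void)sig;
	dump_requested = 1;	/* only set a flag; do the work in the loop */
}

static void setup_snapshot_signal(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = sigusr2_handler;
	sigaction(SIGUSR2, &sa, NULL);
}

/*
 * The record loop would then check dump_requested after each poll()
 * and flush every event's ring buffer before writing the snapshot,
 * so records produced right before the SIGUSR2 are not left sitting
 * in per-event buffers -- the gap described above when 'no-buffer'
 * is not set.
 */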

Also, all output events go into one perf.data, which is not user friendly.
Our final goal is to make perf a daemonized monitor which can run 24x7
in the user's environment. Each time a glitch is detected, a framework sends
a signal to perf to get a perf.data file from it. The framework manages
those perf.data files like logrotate and helps developers analyze those glitches.

Exactly. And that's why my daemon is written the way it is. It is intended to run 24x7x365. It retains the last N events, which are dumped when some external trigger tells it to.

Arnaldo: you asked about an event in the stream, but that is not possible. My scheduling daemon targets CPU usage prior to a significant event (what was running, how long, where, etc). The significant event in the motivating case was STP timeouts -- if the stp daemon is not able to send BPDUs, why? What was running leading up to the timeout? The point is that something external to the perf daemon says 'hey, save the last N events for analysis'.

This case sounds like a generalization of my problem, with the desire to write a perf.data file instead of processing the events and dumping them to a file. It is doable. For example, synthesize task events for all threads still in memory and then write out the saved samples.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/