Re: [PATCH] perf record: Add snapshot mode support for perf's regular events

From: Wangnan (F)
Date: Wed Nov 25 2015 - 02:52:12 EST




On 2015/11/25 15:22, Adrian Hunter wrote:
On 25/11/15 05:50, Wangnan (F) wrote:

On 2015/11/24 23:20, Arnaldo Carvalho de Melo wrote:
Em Tue, Nov 24, 2015 at 08:06:41AM -0700, David Ahern escreveu:
On 11/24/15 7:00 AM, Yunlong Song wrote:
+static int record__write(struct record *rec, void *bf, size_t size)
+{
+ if (rec->memory.size && memory_enabled) {
+ if (perf_memory__write(&rec->memory, bf, size) < 0) {
+ pr_err("failed to write memory data, error: %m\n");
+ return -1;
+ }
+ } else {
+ if (perf_data_file__write(rec->session->file, bf, size) < 0) {
+ pr_err("failed to write perf data, error: %m\n");
+ return -1;
+ }
+ rec->bytes_written += size;
}

- rec->bytes_written += size;
return 0;
}

@@ -86,6 +214,8 @@ static int record__mmap_read(struct record *rec, int idx)
if (old == head)
return 0;

+ memory_enabled = 1;
+
rec->samples++;

size = head - old;
@@ -113,6 +243,7 @@ static int record__mmap_read(struct record *rec, int idx)
md->prev = old;
perf_evlist__mmap_consume(rec->evlist, idx);
out:
+ memory_enabled = 0;
return rc;
}

So you are basically ignoring all samples until SIGUSR2 is received. That
No, he is not; it's just that his code is difficult to follow and has to be
rewritten, but he is ignoring just PERF_RECORD_SAMPLE events, so it
will..

means the resulting data file will have limited history of task events for
... have a complete history of task events, since PERF_RECORD_FORK, etc
are not being ignored.

No?
Actually, we are discussing this problem.

For such tracking events (PERF_RECORD_FORK...), we have the dummy event, so
we can receive tracking events through a separate channel and don't have to
parse every event to pick them out. Instead, we can process tracking events
differently, which makes more interesting things possible. For example,
squashing those tracking events if they take too much memory...

Furthermore, there's another problem being discussed: if the userspace ring
buffer is byte based, parsing events is unavoidable. Without parsing events
we are unable to find the new 'head' pointer when overwriting.
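
To illustrate the point about the 'head' pointer, here is a minimal sketch
of a byte-based overwriting ring buffer. It is not the code from the patch:
struct byte_ring, ring_write(), RB_SIZE and demo_overwrite() are hypothetical
names, and struct hdr only mirrors the layout of struct perf_event_header.
The only way to make room for a new record is to read the size field of the
oldest record and skip over it, which is the parsing mentioned above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RB_SIZE 64	/* hypothetical capacity, must be a multiple of 8 */

/* Same field layout as struct perf_event_header. */
struct hdr {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size, header included */
};

/* Hypothetical userspace ring buffer that stores raw record bytes. */
struct byte_ring {
	unsigned char data[RB_SIZE];
	size_t head;	/* offset of the oldest intact record */
	size_t tail;	/* offset where the next record is written */
	size_t used;	/* bytes occupied by intact records */
};

static void ring_write(struct byte_ring *rb, const void *rec, uint16_t size)
{
	const unsigned char *src = rec;

	assert(size >= sizeof(struct hdr) && size % 8 == 0 && size <= RB_SIZE);

	/*
	 * Make room by dropping whole records from the head. Records are
	 * 8-byte aligned, so an 8-byte header never wraps around the end.
	 * Each step has to parse the header to learn the record's size.
	 */
	while (RB_SIZE - rb->used < size) {
		struct hdr h;

		memcpy(&h, rb->data + rb->head, sizeof(h));
		rb->head = (rb->head + h.size) % RB_SIZE;
		rb->used -= h.size;
	}

	/* Copy the new record byte by byte, wrapping if needed. */
	for (uint16_t i = 0; i < size; i++)
		rb->data[(rb->tail + i) % RB_SIZE] = src[i];
	rb->tail = (rb->tail + size) % RB_SIZE;
	rb->used += size;
}

/* Write three 24-byte records into the 64-byte buffer. */
size_t demo_overwrite(void)
{
	struct byte_ring rb = {0};
	unsigned char rec[24] = {0};
	struct hdr h = { .type = 9 /* PERF_RECORD_SAMPLE */, .misc = 0,
			 .size = sizeof(rec) };

	memcpy(rec, &h, sizeof(h));
	for (int i = 0; i < 3; i++)
		ring_write(&rb, rec, h.size);
	return rb.head;
}
```

In this sketch the third write does not fit, so the first record is dropped
and demo_overwrite() returns 24, the offset of the second record.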
Have you considered trying to find the head by trial-and-error at the time
you make the snapshot i.e. look at the first 8 bytes (event records are 8
byte aligned) and see if it is a valid record header, if not try the next 8
bytes. When you find a real event record it should parse without error and
the subsequent events should all parse without error too, all the way to the
tail. Then you can use timestamps and compare the events byte-by-byte to
avoid overlaps between 2 snapshots.

That does not seem to work. Now that we have the BPF output event, a BPF
program can output anything through that event. Even if we put a magic
number at the head of each event, we can't prevent a BPF output event from
emitting that magic, unless we introduce some 'escape' method to keep BPF
output events from emitting certain data patterns. So although it might
work in real life, this solution is logically incorrect. Or am I missing
something?
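
To make the concern concrete, here is a rough sketch of the trial-and-error
scan and of one way it can be fooled. It is only an illustration under my
own assumptions, not code from perf: find_head(), chain_parses(),
header_plausible(), EV_TYPE_MAX and demo_find_head() are hypothetical names,
and struct ev_header only mirrors the layout of struct perf_event_header.
The leftover bytes in front of the first intact record stand for the payload
of a partially overwritten record, which a BPF program could have filled
with anything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same field layout as struct perf_event_header. */
struct ev_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;	/* total record size, header included */
};

#define EV_TYPE_MAX 64	/* hypothetical upper bound on valid type values */

/* Hypothetical plausibility test for a candidate header. */
static int header_plausible(const struct ev_header *h, size_t remaining)
{
	return h->type != 0 && h->type < EV_TYPE_MAX &&
	       h->size >= sizeof(*h) && h->size % 8 == 0 &&
	       h->size <= remaining;
}

/* Return 1 if records starting at 'off' chain exactly up to 'len'. */
static int chain_parses(const unsigned char *buf, size_t len, size_t off)
{
	struct ev_header h;

	while (off < len) {
		if (len - off < sizeof(h))
			return 0;
		memcpy(&h, buf + off, sizeof(h));
		if (!header_plausible(&h, len - off))
			return 0;
		off += h.size;
	}
	return off == len;
}

/* Try each 8-byte aligned offset; return the first that parses, or -1. */
static long find_head(const unsigned char *buf, size_t len)
{
	assert(buf != NULL);
	for (size_t off = 0; off + sizeof(struct ev_header) <= len; off += 8)
		if (chain_parses(buf, len, off))
			return (long)off;
	return -1;
}

/*
 * Bytes 0..15 are what is left of a partially overwritten record; the
 * first intact record really starts at offset 16. If the leftover payload
 * happens to look like a header, the scan settles on the wrong offset.
 */
long demo_find_head(int payload_mimics_header)
{
	unsigned char buf[48] = {0};
	struct ev_header fake = { .type = 9, .misc = 0, .size = 16 };
	struct ev_header real = { .type = 9, .misc = 0, .size = 32 };

	if (payload_mimics_header)
		memcpy(buf, &fake, sizeof(fake));
	memcpy(buf + 16, &real, sizeof(real));
	return find_head(buf, sizeof(buf));
}
```

With zeroed leftover bytes, demo_find_head(0) returns 16, the true head.
When the leftover payload mimics a header, demo_find_head(1) returns 0, and
everything after it still parses cleanly up to the tail, so the parse alone
gives no hint that the head is wrong.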

Thank you.

