Re: [BUG] perf report: ordered events and flushing bug

From: Stephane Eranian
Date: Thu Mar 12 2015 - 16:24:59 EST


On Thu, Mar 12, 2015 at 4:16 PM, Arnaldo Carvalho de Melo
<acme@xxxxxxxxxx> wrote:
> Em Thu, Mar 12, 2015 at 01:53:29PM -0600, David Ahern escreveu:
>> On 3/12/15 1:39 PM, Stephane Eranian wrote:
>> >What's the point of having all the ordered event logic if you are saying events
>> >must be saved in order? I don't think there is a way to make that guarantee
>> >when monitoring multiple CPUs at the same time.
>>
>> The record command does not analyze the events, it just copies from
>> mmap to file in lumps per mmap. e.g., on a given round the perf data
>> file has events like this:
>>
>> 111112223344444444555566666F111111111
>> |<------- round --------->|^
>>                            |
>>      finished round event -|
>>
>> where 11111 are events read from mmap1, 2222 are events from mmap2,
>> etc. F is the finished round event, which marks that a pass over all
>> mmaps has been done.
>>
>> So for mmap1 all of the 11111 events are in time order, then jumping
>> to mmap2 events the 2222 events are time sorted relative to mmap2 but
>> not relative to mmap1 events.
>>
>> The ordered events code sorts the clumps into a time based stream:
>> 123141641445124564234645656...
>
> And it does that because it merges all the mmap buffers into just one
> file...
>
> OK, for inserting MMAP events (or any other), I think one could either
> use perf inject and merge two perf.data files, both in order, or add a
> 'perf data merge' subcommand to 'perf data', perhaps the latter will be
> useful in more cases.
>
> But there is something else here, we should take advantage of the fact
> that events in each perf mmap are ordered and keep that in the output of
> perf record, i.e. we should start one thread per CPU that will just
> write into a .perf.data/cpu-N file
>
> Then, when reading, we will do what I'll do for 'trace' and 'top', i.e.
> order the N cpus and go on processing in order, if you need that
> (tracing, perf top perhaps).
>
> Or do a first pass, get the lifetime events, aka the PERF_RECORD_
> metadata, stash in the struct machine rbtrees, as usual, but keeping a
> reference to all threads, even the dead ones, which I guess is what
> Namhyung does in some way in his patchkit, then go wild processing the
> samples in parallel.
>
> So, I think for Stephane, right now, the easiest path to follow is to
> hack 'perf inject' to insert the MMAP events where he needs, right?
>
Well, I had that last week but wanted to avoid the extra step for the
user. I will go back to it and verify that this approach also works in
pipe mode.
Thanks.
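
For illustration only, here is a minimal Python sketch (not perf code) of
the layout David describes above: events copied to the file in lumps per
mmap, and the time-ordered stream obtained by merging those lumps. The
timestamps, mmap ids, and function names are all made up for the example,
and it assumes each mmap's events are already in time order. It uses a
plain k-way merge and does not model the finished round flush logic.

```python
import heapq

# Hypothetical (timestamp, mmap_id) events. Each per-mmap list is
# assumed to be in time order already, as stated in the thread.
mmap1 = [(10, 1), (40, 1), (70, 1)]
mmap2 = [(20, 2), (50, 2)]
mmap3 = [(30, 3), (60, 3)]


def record_round(mmaps):
    """Copy each mmap to the 'file' in one lump, without sorting."""
    return [ev for buf in mmaps for ev in buf]


def merge_ordered(mmaps):
    """K-way merge of the already sorted per-mmap streams by timestamp."""
    return list(heapq.merge(*mmaps))


on_disk = record_round([mmap1, mmap2, mmap3])
merged = merge_ordered([mmap1, mmap2, mmap3])

print([m for _, m in on_disk])  # clumped per mmap: [1, 1, 1, 2, 2, 3, 3]
print([m for _, m in merged])   # time ordered:     [1, 2, 3, 1, 2, 3, 1]
```

The same merge would apply if each CPU had its own output file, since
each input stream is already sorted and only the interleaving has to be
decided at read time.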