RE: [PATCH 02/10] perf tool: fix: Don't discard prev in backward mode

From: Liang, Kan
Date: Tue Oct 10 2017 - 15:36:29 EST


> > diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
> > index 33b8837..7d23cf5 100644
> > --- a/tools/perf/util/evlist.c
> > +++ b/tools/perf/util/evlist.c
> > @@ -742,13 +742,25 @@ static int
> > rb_find_range(void *data, int mask, u64 head, u64 old,
> > u64 *start, u64 *end, bool backward)
> > {
> > + int ret;
> > +
> > if (!backward) {
> > *start = old;
> > *end = head;
> > return 0;
> > }
> >
> > - return backward_rb_find_range(data, mask, head, start, end);
> > + ret = backward_rb_find_range(data, mask, head, start, end);
> > +
> > + /*
> > + * The start and end from backward_rb_find_range is the range for
> all
> > + * valid data in ring buffer.
> > + * However, part of the data is processed previously.
> > + * Reset the end to drop the processed data
> > + */
> > + *end = old;
> > +
>
> I don't understand this patch. What rb_find_range() wants to do is to
> find start
> and end pointer, so record__mmap_read() knows where to start reading and
> where to stop.
> In backward mode, 'start' pointer is clearly 'head' (because 'head' is
> the header of the
> last record kernel writes to the ring buffer), but the 'end' pointer is
> not very easy to
> find ('end' pointer is the last (the earliest) available record in the
> ring buffer).
> We have to parse the whole ring buffer to find the last available record
> and set its
> tail into 'end', this is what backward_rb_find_range() tries hard to do.

Yes.

> However,
> your patch unconditionally overwrites *end (which is set by
> backward_rb_find_range()),
> makes backward_rb_find_range() meaningless.
>

Think a bit more.
Yes, unconditionally overwrite has problem if kernel has already went
through the whole ring buffer and starts to overwrite.
However, I think rb_find_range/backward_rb_find_range still need to be
refined.
'old' is a good indicator for looking for the 'end' pointer.
It could speed up the backward_rb_find_range.

Also, do you think itâs really needed to read everything from ring buffer
every time?
I think what the tool want is the incremental. It doesnât want to save and
process duplicate data.

So I think itâs better to modify the backward_rb_find_range to return the
'start' and 'end' for incremental.

> In my mind when you decide to use backward ring buffer you must make it
> overwrite
> because you want the kernel silently overwrite old records without
> waking up perf, while
> still able to parse the ring buffer even if overwriting happens. The use
> case
> should be: perf runs in background, kernel silently ignores old records
> in the ring
> buffer until something bad happens, then perf dumps 'events before the
> bad thing'.
> Similar to fight recorder. In this use case, each dumping is
> independent. Even
> if one record is appears in previous dumping, it is harmless (and
> useful) to make
> it appears in following dumping again.

It's not harmless for --switch-output, especially if we limit the output size.

I'm still not sure why we need to save the same data in different dumps.
It's also harmful for the performance as well.

>
> If you really want to avoid record duplication, you need to changes
> record__mmap_read()'s
> logic. Now it complains "failed to keep up with mmap data" and avoid
> dumping data when
> size of newly generated data is larger than the size of the ring buffer.
> It is reasonable
> for forward ring buffer because in this case you lost the head of the
> first record, the
> whole ring buffer is unparseable. However, it is wrong in backward case.
> What you
> should do in this case is dumping the whole ring buffer.
>
> > + return ret;
> > }
> >
> > /*
>