Re: [PATCH 4/4] perf core: Add backward attribute to perf event

From: Alexei Starovoitov
Date: Tue Mar 29 2016 - 00:59:21 EST


On Tue, Mar 29, 2016 at 10:01:24AM +0800, Wangnan (F) wrote:
>
>
> On 2016/3/28 14:41, Wang Nan wrote:
>
> [SNIP]
>
> >
> >To prevent this problem, we need to find a way to ensure the ring buffer
> >is stable during reading. ioctl(PERF_EVENT_IOC_PAUSE_OUTPUT) is
> >suggested because its overhead is lower than
> >ioctl(PERF_EVENT_IOC_ENABLE).
> >
>
> Add comment:
>
> By carefully verifying the 'header' pointer, the reader can avoid
> pausing the ring buffer. For example:
>
> /* A union of all possible events */
> union perf_event event;
>
> p = head = perf_mmap__read_head();
> while (true) {
>         /* copy header of next event */
>         fetch(&event.header, p, sizeof(event.header));
>
>         /* read 'head' pointer */
>         head = perf_mmap__read_head();
>
>         /* check overwritten: is the header good? */
>         if (!verify(sizeof(event.header), p, head))
>                 break;
>
>         /* copy the whole event */
>         fetch(&event, p, event.header.size);
>
>         /* read 'head' pointer again */
>         head = perf_mmap__read_head();
>
>         /* is the whole event good? */
>         if (!verify(event.header.size, p, head))
>                 break;
>         p += event.header.size;
> }
>
> However, the overhead is high because:
>
> a) In-place decoding is unsafe. Copy-verifying-decode is required.
> b) Fetching 'head' pointer requires additional synchronization.

Such a trick may work, but the pause is needed for more than read
stability. When we collect events into an overwrite buffer, we are
waiting for some other trigger (such as a utilization spike on all
cpus, or just one cpu running while all the others are idle). When
the trigger fires, the buffer holds valuable info from the past. At
that point new events are no longer interesting: the buffer should
be paused, the events read, and the buffer unpaused until the next
trigger comes.
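
To make the intended flow concrete, here is a rough userspace sketch.
It is only a toy model under stated assumptions, not the kernel ring
buffer layout: struct toy_ring, RING_SLOTS and the toy_ring_*()
helpers are all hypothetical names, and toy_ring_pause() merely stands
in for ioctl(fd, PERF_EVENT_IOC_PAUSE_OUTPUT, ...) as proposed in this
series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 4            /* hypothetical capacity, in events */

/* Toy stand-in for one overwrite ring buffer; not the kernel layout. */
struct toy_ring {
        int slot[RING_SLOTS];   /* event payloads */
        size_t written;         /* total events accepted so far */
        bool paused;            /* models the proposed paused state */
};

/* Writer side: overwrite the oldest slot unless output is paused. */
void toy_ring_write(struct toy_ring *r, int event)
{
        if (r->paused)
                return;         /* events after the trigger are dropped */
        r->slot[r->written % RING_SLOTS] = event;
        r->written++;
}

/* Stand-in for ioctl(fd, PERF_EVENT_IOC_PAUSE_OUTPUT, pause). */
void toy_ring_pause(struct toy_ring *r, bool pause)
{
        r->paused = pause;
}

/*
 * Reader side: copy the surviving events out, oldest first.
 * Meant to be called while the ring is paused, so that nothing
 * moves underneath the reader. Returns the number of events copied.
 */
size_t toy_ring_read(const struct toy_ring *r, int *out, size_t out_len)
{
        size_t n, first, i;

        assert(out_len >= RING_SLOTS);
        n = r->written < RING_SLOTS ? r->written : RING_SLOTS;
        first = r->written - n; /* index of the oldest surviving event */
        for (i = 0; i < n; i++)
                out[i] = r->slot[(first + i) % RING_SLOTS];
        return n;
}
```

The caller would do toy_ring_pause(&r, true) when the trigger fires,
then toy_ring_read(), then toy_ring_pause(&r, false) to wait for the
next trigger. Anything written in between is dropped, so the snapshot
contains only the history leading up to the trigger.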