Re: [PATCH v4 09/22] perf: Support overwrite mode for AUX area

From: Alexander Shishkin
Date: Tue Sep 09 2014 - 07:54:22 EST


Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

> On Tue, Sep 09, 2014 at 12:40:39PM +0300, Alexander Shishkin wrote:
>> Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:
>>
>> > On Wed, Aug 20, 2014 at 03:36:06PM +0300, Alexander Shishkin wrote:
>> >
>> >> diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c
>> >> index 925f369947..5006caba63 100644
>> >> --- a/kernel/events/ring_buffer.c
>> >> +++ b/kernel/events/ring_buffer.c
>> >
>> >> @@ -294,9 +295,22 @@ void perf_aux_output_end(struct perf_output_handle *handle, unsigned long size,
> >> >>  			 bool truncated)
> >> >>  {
> >> >>  	struct ring_buffer *rb = handle->rb;
> >> >> +	unsigned long aux_head;
> >> >>
> >> >> +	aux_head = local_read(&rb->aux_head);
> >> >> +
> >> >> +	if (rb->aux_overwrite) {
> >> >> +		local_set(&rb->aux_head, size);
> >> >> +
> >> >> +		/*
> >> >> +		 * Send a RECORD_AUX with size==0 to communicate aux_head
> >> >> +		 * of this snapshot to userspace
> >> >> +		 */
> >> >> +		perf_event_aux_event(handle->event, size, 0, truncated);
>> >
>> > Humm.. why not write a 'normal' AUX record?
>>
> >> In this mode, the hardware runs in circular buffer mode, overwriting
> >> old data, so we don't actually know the size of the snapshot; we have
> >> userspace figure it out later on (based on timestamps, for example).
> >> I didn't want to configure a PMI for this mode, to avoid the overhead,
> >> but with a PMI we could try to keep track of the overwrites and infer
> >> the actual snapshot size in the kernel. That applies to Intel PT; as
> >> far as I can tell, ARM's scatter-gather trace-to-memory storing block
> >> does not generate interrupts at all.
>
> Well, wouldn't the 'size' be basically the entire buffer. All you have
> to then provide is the head pointer.

Yes, that's what the code above does: it only reports aux_head. We could
pass size==$buffer_size instead of size==0; both would mean the same thing.
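
To illustrate why the two encodings are interchangeable, here is a minimal
consumer-side sketch. The struct and function names are hypothetical (they
are not the perf ABI or anything in this patch), and it assumes the consumer
already knows the AUX buffer size:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical consumer-side view of a RECORD_AUX; not the real ABI layout. */
struct aux_record {
	size_t head;	/* aux_head of this snapshot */
	size_t size;	/* 0 or the buffer size in overwrite mode */
};

/* A snapshot record under either encoding: size==0 or size==buf_size. */
static bool aux_record_is_snapshot(const struct aux_record *rec,
				   size_t buf_size)
{
	assert(buf_size > 0);
	return rec->size == 0 || rec->size == buf_size;
}

/* Bytes the decoder has to look at: the whole buffer for a snapshot. */
static size_t aux_record_span(const struct aux_record *rec, size_t buf_size)
{
	return aux_record_is_snapshot(rec, buf_size) ? buf_size : rec->size;
}
```

Either way the decoder ends up scanning the entire buffer, and only the head
carries information.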

> Ideally you would also provide a
> tail pointer so you know when to stop, but I suppose you can infer that
> from the data stream itself?

The tail pointer is the problem I mentioned above, because it's either

- where we stopped the previous time
- head+1, if old data is overwritten

and to tell the two cases apart, we need an interrupt. Without one, we can
still infer where the new data starts from the timestamps in the trace
stream, so the decoder can take care of it (and that's how it's done at
the moment).
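
The two cases above can be sketched roughly as follows. This is only an
illustration with made-up names; the `wrapped` flag stands for exactly the
piece of information that we don't have without an interrupt, and wrapping
head+1 modulo the buffer size is my assumption about the offsets:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: pick the tail of the current snapshot.
 *
 * prev_head: where we stopped the previous time
 * head:      aux_head reported for this snapshot
 * wrapped:   whether the hardware overwrote old data since prev_head;
 *            in the kernel this would have to come from an interrupt
 */
static size_t aux_snapshot_tail(size_t prev_head, size_t head,
				size_t buf_size, bool wrapped)
{
	assert(buf_size > 0);
	assert(prev_head < buf_size && head < buf_size);

	/* Old data overwritten: the oldest surviving data follows head. */
	if (wrapped)
		return (head + 1) % buf_size;

	/* Nothing overwritten: new data starts where we stopped before. */
	return prev_head;
}
```

Since the kernel cannot fill in `wrapped` today, the decoder effectively
makes the same choice from the timestamps instead.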

> If you can provide the tail you can indeed
> compute the size etc.. at which point you don't have to rely on parsing
> the stream etc.

Ideally, there won't be many adjacent snapshots, so this won't even be a
problem. But yes, if we want to know the tail/size reliably, we need an
interrupt.

Regards,
--
Alex