Re: [RFC PATCH 1/2 v2] Unified trace buffer

From: Linus Torvalds
Date: Thu Sep 25 2008 - 13:04:23 EST




On Thu, 25 Sep 2008, Steven Rostedt wrote:
> +
> +/**
> + * ring_buffer_event_length - return the length of the event
> + * @event: the event to get the length of
> + *
> + * Note, if the event is bigger than 256 bytes, the length
> + * can not be held in the shifted 5 bits. The length is then
> + * added as a short (unshifted) in the body.

The comment seems stale ;)

> +
> +/**
> + * ring_buffer_peek - peek at the next event to be read
> + * @iter: The ring buffer iterator
> + * @iter_next_cpu: The CPU that the next event belongs on
> + *
> + * This will return the event that will be read next, but does
> + * not increment the iterator.
> + */
> +struct ring_buffer_event *
> +ring_buffer_peek(struct ring_buffer *buffer, int cpu, u64 *ts)
> +{
> + struct ring_buffer_per_cpu *cpu_buffer;
> + struct ring_buffer_event *event;
> + u64 delta;
> +
> + cpu_buffer = buffer->buffers[cpu];
> +
> + again:
> + if (ring_buffer_per_cpu_empty(cpu_buffer))
> + return NULL;
> +
> + event = ring_buffer_head_event(cpu_buffer);
> +
> + switch (event->type) {
> + case RB_TYPE_PADDING:
> + ring_buffer_inc_page(buffer, &cpu_buffer->head_page);
> + rb_reset_read_page(cpu_buffer);
> + goto again;
> +
> + case RB_TYPE_TIME_EXTENT:
> + delta = event->data;
> + delta <<= TS_SHIFT;
> + delta += event->time_delta;
> + cpu_buffer->read_stamp += delta;
> + goto again;
> +
> + case RB_TYPE_TIME_STAMP:
> + /* FIXME: not implemented */
> + goto again;
> +
> + case RB_TYPE_SMALL_DATA:
> + case RB_TYPE_LARGE_DATA:
> + case RB_TYPE_STRING:
> + if (ts)
> + *ts = cpu_buffer->read_stamp + event->time_delta;
> + return event;

Your timestamp handling seems odd. You do it per-event, but I think it
should happen for all events, ie just do

*ts += event->time_delta;

_outside_ the case statement, and then in RB_TYPE_TIME_EXTENT you'd do
either

- relative:
*ts += event->data << TS_SHIFT;

- absolute timestamp events:
*ts = (event->data << TS_SHIFT) + event->time_delta;

but the bigger issue is that I think the timestamp should be relative to
the _previous_ event, not relative to the page start. IOW, you really
should accumulate them.

IOW, the base timestamp cannot be in the cpu_buffer, it needs to be in the
iterator data structure, since it updates as you walk over it.

Otherwise the extended TSC format will be _horrible_. You don't want to
add it in front of every event in the page just because you had a pause at
the beginning of the page. You want to have a running update, so that you
only need to add it after there was a pause.

Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/