Re: [PATCH v9] Unified trace buffer

From: Steven Rostedt
Date: Sat Sep 27 2008 - 15:25:13 EST

Next message: Steven Rostedt: "Re: [ath9k-devel] ath9k: massive unexplained latency in 2.6.27 (rc5,rc6, probably others)"
Previous message: Arjan van de Ven: "Re: [patch] ioremap sanity check to catch mapping requests exceedingthe BAR sizes"
In reply to: Ingo Molnar: "Re: [PATCH v9] Unified trace buffer"
Next in thread: Ingo Molnar: "Re: [PATCH v9] Unified trace buffer"

Hi Ingo,

Thanks for the review!

On Sat, 27 Sep 2008, Ingo Molnar wrote:

>
> small nitpicking review, nothing structural yet:
>
> * Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> > Index: linux-trace.git/include/linux/ring_buffer.h
> > +enum {
> > + RB_TYPE_PADDING, /* Left over page padding
>
> RB_ clashes with red-black tree namespace. (on the thought level)

Yeah, Linus pointed this out with the rb_ static function names. But since
the functions are static I kept them as is. But here we have global names.

Would RNGBF_ be OK, or do you have any other ideas?

>
> > +#define RB_ALIGNMENT_SHIFT 2
> > +#define RB_ALIGNMENT (1 << RB_ALIGNMENT_SHIFT)
> > +#define RB_MAX_SMALL_DATA (28)
>
> no need to put numeric literals into parenthesis.

Ah, I think I had it more complex and changed it to a literal without
removing the parentheses.

>
> > +static inline unsigned
> > +ring_buffer_event_length(struct ring_buffer_event *event)
> > +{
> > + unsigned length;
> > +
> > + switch (event->type) {
> > + case RB_TYPE_PADDING:
> > + /* undefined */
> > + return -1;
> > +
> > + case RB_TYPE_TIME_EXTENT:
> > + return RB_LEN_TIME_EXTENT;
> > +
> > + case RB_TYPE_TIME_STAMP:
> > + return RB_LEN_TIME_STAMP;
> > +
> > + case RB_TYPE_DATA:
> > + if (event->len)
> > + length = event->len << RB_ALIGNMENT_SHIFT;
> > + else
> > + length = event->array[0];
> > + return length + RB_EVNT_HDR_SIZE;
> > + default:
> > + BUG();
> > + }
> > + /* not hit */
> > + return 0;
>
> too large, please uninline.

I calculated this on x86_64 to add 78 bytes. Is that still too big?

>
> > +static inline void *
> > +ring_buffer_event_data(struct ring_buffer_event *event)
> > +{
> > + BUG_ON(event->type != RB_TYPE_DATA);
> > + /* If length is in len field, then array[0] has the data */
> > + if (event->len)
> > + return (void *)&event->array[0];
> > + /* Otherwise length is in array[0] and array[1] has the data */
> > + return (void *)&event->array[1];
> > +}
>
> ditto.

No biggy. I thought this would be nicer as inline. But I have no problem
changing this.
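
For readers following along, here is a minimal user-space sketch of the
length encoding that the two helpers quoted above rely on. The struct and
every sketch_* name are hypothetical stand-ins, the 4-byte header size is
an assumption, and the writer-side rounding in sketch_set_length() is my
guess at the counterpart of the reader code, not something quoted in this
thread.

```c
#include <stdint.h>

/* Values copied from the quoted patch. */
#define RB_ALIGNMENT_SHIFT	2
#define RB_MAX_SMALL_DATA	28

/* Assumption for this sketch: the event header is one 32-bit word. */
#define SKETCH_HDR_SIZE		4

/* Hypothetical stand-in for struct ring_buffer_event (data events only). */
struct sketch_event {
	uint32_t len;		/* payload in 4-byte units; 0 means "see array[0]" */
	uint32_t array[32];
};

/* Record a payload size (assumed writer side). Assumes bytes > 0. */
static void sketch_set_length(struct sketch_event *event, uint32_t bytes)
{
	if (bytes > RB_MAX_SMALL_DATA) {
		/* Too big for the small field: store the size in array[0]. */
		event->len = 0;
		event->array[0] = bytes;
	} else {
		/* Round up to the 4-byte alignment unit. */
		event->len = (bytes + 3) >> RB_ALIGNMENT_SHIFT;
	}
}

/* Same logic as the RB_TYPE_DATA case of ring_buffer_event_length(). */
static unsigned sketch_event_length(const struct sketch_event *event)
{
	unsigned length;

	if (event->len)
		length = event->len << RB_ALIGNMENT_SHIFT;
	else
		length = event->array[0];
	return length + SKETCH_HDR_SIZE;
}

/* Same logic as ring_buffer_event_data(): the payload follows the length. */
static void *sketch_event_data(struct sketch_event *event)
{
	if (event->len)
		return &event->array[0];
	return &event->array[1];
}
```

With these definitions a 10-byte payload is stored as len = 3, so the
event occupies 12 + 4 = 16 bytes and the data starts at array[0]. A
100-byte payload leaves len at 0, puts 100 in array[0], and the data
starts at array[1].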

>
> > +/* FIXME!!! */
> > +u64 ring_buffer_time_stamp(int cpu)
> > +{
> > + /* shift to debug/test normalization and TIME_EXTENTS */
> > + return sched_clock() << DEBUG_SHIFT;
>
> [ duly noted ;-) ]
>
> > +}
> > +void ring_buffer_normalize_time_stamp(int cpu, u64 *ts)
>
> needs extra newline above.

Yeah, I kept them stuck together just to stress the "FIXME" part ;-)

>
> > +/*
> > + * head_page == tail_page && head == tail then buffer is empty.
> > + */
> > +struct ring_buffer_per_cpu {
> > + int cpu;
> > + struct ring_buffer *buffer;
> > + raw_spinlock_t lock;
>
> hm, should not be raw, at least initially. I am 95% sure we'll see
> lockups, we always did when we iterated ftrace's buffer implementation
> ;-)

It was to prevent lockdep from checking the locks from inside. We had
issues with ftrace and lockdep in the past, because ftrace would trace the
internals of lockdep, and lockdep would then recurse back into itself to
trace. If lockdep itself can get away with not using raw_spinlocks, then
it will be OK to change this back to a spinlock.
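
A toy user-space model of that recursion, for illustration only. Nothing
here is kernel code: all sketch_* names are hypothetical, and the depth
limit exists only so the toy terminates.

```c
#define SKETCH_DEPTH_LIMIT	8	/* stop the toy recursion here */

static int sketch_depth;
static int sketch_max_depth;

static void sketch_trace_event(int lock_is_checked);

/* Toy lock checker: because it is traced, it calls back into the tracer. */
static void sketch_lock_check(int lock_is_checked)
{
	sketch_trace_event(lock_is_checked);
}

/* Toy tracer: takes its buffer lock, which may or may not be checked. */
static void sketch_trace_event(int lock_is_checked)
{
	sketch_depth++;
	if (sketch_depth > sketch_max_depth)
		sketch_max_depth = sketch_depth;

	/* A checked lock enters the checker, which is traced again. */
	if (lock_is_checked && sketch_depth < SKETCH_DEPTH_LIMIT)
		sketch_lock_check(lock_is_checked);

	sketch_depth--;
}

/* Returns how deep the tracer nested for one top-level event. */
static int sketch_nesting(int lock_is_checked)
{
	sketch_depth = 0;
	sketch_max_depth = 0;
	sketch_trace_event(lock_is_checked);
	return sketch_max_depth;
}
```

With an unchecked (raw) lock the tracer is entered once. With a checked
lock it keeps re-entering itself until the artificial limit stops it.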

>
> > +struct ring_buffer {
> > + unsigned long size;
> > + unsigned pages;
> > + unsigned flags;
> > + int cpus;
> > + cpumask_t cpumask;
> > + atomic_t record_disabled;
> > +
> > + struct mutex mutex;
> > +
> > + struct ring_buffer_per_cpu **buffers;
> > +};
> > +
> > +struct ring_buffer_iter {
> > + struct ring_buffer_per_cpu *cpu_buffer;
> > + unsigned long head;
> > + struct buffer_page *head_page;
> > + u64 read_stamp;
>
> please use consistent vertical whitespaces. Above, in the struct
> ring_buffer definition, you can add another tab to most of the vars -
> that will also make the '**buffers' line look nice.

OK, will fix.

>
> same for all structs across this file. In my experience, a 50% vertical
> break works best - the one you used here in 'struct ring_buffer_iter'.
>
> > +};
> > +
> > +#define CHECK_COND(buffer, cond) \
> > + if (unlikely(cond)) { \
> > + atomic_inc(&buffer->record_disabled); \
> > + WARN_ON(1); \
> > + return -1; \
> > + }
>
> please name it RINGBUFFER_BUG_ON() / RINGBUFFER_WARN_ON(), so that we
> dont have to memorize another set of debug names. [ See
> DEBUG_LOCKS_WARN_ON() in include/linux/debug_locks.h ]

OK, this was a direct copy from what was used in ftrace.
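
A rough user-space sketch of what the renamed macro could look like. This
is not the kernel version: atomic_inc() is replaced by a plain increment,
WARN_ON() by fprintf(), and the do/while (0) wrapper is my own addition so
that the macro behaves as a single statement. All SKETCH_/sketch_ names
are hypothetical.

```c
#include <stdio.h>

/* Hypothetical stand-in for struct ring_buffer. */
struct sketch_buffer {
	int record_disabled;
};

/*
 * On failure: disable recording, warn, and return -1 from the caller.
 * Like CHECK_COND, this hides a return statement inside the macro.
 */
#define SKETCH_RINGBUFFER_WARN_ON(buffer, cond)				\
	do {								\
		if (cond) {						\
			(buffer)->record_disabled++;			\
			fprintf(stderr, "WARN: %s\n", #cond);		\
			return -1;					\
		}							\
	} while (0)

/* Example caller: rejects a non-positive page count. */
static int sketch_check_pages(struct sketch_buffer *buffer, int nr_pages)
{
	SKETCH_RINGBUFFER_WARN_ON(buffer, nr_pages <= 0);
	return 0;
}
```

Calling sketch_check_pages() with a valid count returns 0 and leaves
record_disabled alone. A count of 0 prints the warning, bumps
record_disabled, and returns -1.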

>
> you can change it to:
>
> > +static int
> > +rb_allocate_pages(struct ring_buffer_per_cpu *cpu_buffer, unsigned nr_pages)
> > +{
> > + struct list_head *head = &cpu_buffer->pages;
> > + LIST_HEAD(pages);
> > + struct buffer_page *page, *tmp;
> > + unsigned long addr;
> > + unsigned i;
>
> please apply ftrace's standard reverse christmas tree style and move the
> 'pages' line down two lines.

Heh, this was directly from a bug I had and laziness ;-)
I originally just had struct list_head pages (and no *tmp), which kept the
christmas tree format. But later found that you need to initialize list
heads (duh!), and never moved it.
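
To illustrate why the list head has to be initialized, here is a small
user-space sketch of a circular doubly linked list. The sketch_* names
are hypothetical simplifications modeled on the kernel's list_head and
LIST_HEAD(), not the real definitions. The declarations are also ordered
longest first, as requested above.

```c
/* Hypothetical simplified list head, modeled on the kernel's list_head. */
struct sketch_list_head {
	struct sketch_list_head *next, *prev;
};

/* An empty list is a head that points at itself in both directions. */
#define SKETCH_LIST_HEAD(name) \
	struct sketch_list_head name = { &(name), &(name) }

/* Insert entry right after head. Dereferences head->next. */
static void sketch_list_add(struct sketch_list_head *entry,
			    struct sketch_list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Add up to 8 nodes to a local list and count them by walking it. */
static int sketch_collect(int nr_pages)
{
	struct sketch_list_head nodes[8];
	struct sketch_list_head *pos;
	SKETCH_LIST_HEAD(pages);
	int count = 0;
	int i;

	if (nr_pages > 8)
		nr_pages = 8;

	for (i = 0; i < nr_pages; i++)
		sketch_list_add(&nodes[i], &pages);

	/* The walk terminates only because the list is circular. */
	for (pos = pages.next; pos != &pages; pos = pos->next)
		count++;

	return count;
}
```

With a plain "struct sketch_list_head pages;" the first
sketch_list_add() would dereference an uninitialized head->next, which is
the kind of bug described above.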

>
> > +int ring_buffer_resize(struct ring_buffer *buffer, unsigned long size)
> > +{
> > + struct ring_buffer_per_cpu *cpu_buffer;
> > + unsigned long buffer_size;
> > + LIST_HEAD(pages);
> > + unsigned long addr;
> > + unsigned nr_pages, rm_pages, new_pages;
> > + struct buffer_page *page, *tmp;
> > + int i, cpu;
>
> ditto.

Same reason.

>
> > +static inline void *rb_page_index(struct buffer_page *page, unsigned index)
> > +{
> > + void *addr;
> > +
> > + addr = page_address(&page->page);
>
> 'addr' initialization can move to the definition line - you save two
> lines.

Will fix.

>
> > + return addr + index;
> > +}
> > +
> > +static inline struct ring_buffer_event *
> > +rb_head_event(struct ring_buffer_per_cpu *cpu_buffer)
> > +{
> > + return rb_page_index(cpu_buffer->head_page,
> > + cpu_buffer->head);
>
> can all move to the same return line.

Ah, this was caused by my s/ring_buffer_page_index/rb_page_index/ run.

>
> > +}
> > +
> > +static inline struct ring_buffer_event *
> > +rb_iter_head_event(struct ring_buffer_iter *iter)
> > +{
> > + return rb_page_index(iter->head_page,
> > + iter->head);
>
> ditto.

Will fix.

>
> > + for (head = 0; head < rb_head_size(cpu_buffer);
> > + head += ring_buffer_event_length(event)) {
> > + event = rb_page_index(cpu_buffer->head_page, head);
> > + BUG_ON(rb_null_event(event));
>
> ( optional: when there's a multi-line loop then i generally try to insert
> an extra newline when starting the body - to make sure the iterator
> and the body stands apart visually. Matter of taste. )

Will fix, I have no preference.

>
> > +static struct ring_buffer_event *
> > +rb_reserve_next_event(struct ring_buffer_per_cpu *cpu_buffer,
> > + unsigned type, unsigned long length)
> > +{
> > + u64 ts, delta;
> > + struct ring_buffer_event *event;
> > + static int once;
> > +
> > + ts = ring_buffer_time_stamp(cpu_buffer->cpu);
> > +
> > + if (cpu_buffer->tail) {
> > + delta = ts - cpu_buffer->write_stamp;
> > +
> > + if (test_time_stamp(delta)) {
> > + if (unlikely(delta > (1ULL << 59) && !once++)) {
> > + printk(KERN_WARNING "Delta way too big! %llu"
> > + " ts=%llu write stamp = %llu\n",
> > + delta, ts, cpu_buffer->write_stamp);
> > + WARN_ON(1);
> > + }
> > + /*
> > + * The delta is too big, we need to add a
> > + * new timestamp.
> > + */
> > + event = __rb_reserve_next(cpu_buffer,
> > + RB_TYPE_TIME_EXTENT,
> > + RB_LEN_TIME_EXTENT,
> > + &ts);
> > + if (!event)
> > + return NULL;
> > +
> > + /* check to see if we went to the next page */
> > + if (cpu_buffer->tail) {
> > + /* Still on same page, update timestamp */
> > + event->time_delta = delta & TS_MASK;
> > + event->array[0] = delta >> TS_SHIFT;
> > + /* commit the time event */
> > + cpu_buffer->tail +=
> > + ring_buffer_event_length(event);
> > + cpu_buffer->write_stamp = ts;
> > + delta = 0;
> > + }
> > + }
> > + } else {
> > + rb_add_stamp(cpu_buffer, &ts);
> > + delta = 0;
> > + }
> > +
> > + event = __rb_reserve_next(cpu_buffer, type, length, &ts);
> > + if (!event)
> > + return NULL;
> > +
> > + /* If the reserve went to the next page, our delta is zero */
> > + if (!cpu_buffer->tail)
> > + delta = 0;
> > +
> > + event->time_delta = delta;
> > +
> > + return event;
> > +}
>
> this function is too long, please split it up. The first condition's
> body could go into a separate function i guess.

Will fix.
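
For reference, a user-space sketch of how the time extent splits a large
delta across time_delta and array[0]. The shift value of 27 is an
assumption inferred from the "array[0] = time delta (28 .. 59)" comment
in the patch, and all sketch_* names are hypothetical.

```c
#include <stdint.h>

/* Assumption: the low 27 bits of the delta fit in the event header. */
#define SKETCH_TS_SHIFT	27
#define SKETCH_TS_MASK	((1ULL << SKETCH_TS_SHIFT) - 1)

/* Hypothetical stand-in for the two fields a time extent event uses. */
struct sketch_time_extent {
	uint32_t time_delta;	/* low bits of the delta */
	uint32_t array0;	/* remaining high bits */
};

/* Nonzero if the delta does not fit in time_delta alone. */
static int sketch_test_time_stamp(uint64_t delta)
{
	return (delta & ~SKETCH_TS_MASK) != 0;
}

/* Split a delta the way the quoted code fills in the event. */
static struct sketch_time_extent sketch_split_delta(uint64_t delta)
{
	struct sketch_time_extent ext;

	ext.time_delta = (uint32_t)(delta & SKETCH_TS_MASK);
	ext.array0 = (uint32_t)(delta >> SKETCH_TS_SHIFT);
	return ext;
}

/* Reader side: rebuild the full delta from the two fields. */
static uint64_t sketch_join_delta(struct sketch_time_extent ext)
{
	return ((uint64_t)ext.array0 << SKETCH_TS_SHIFT) | ext.time_delta;
}
```

Under that assumption 27 + 32 = 59 bits of delta can be represented,
which would match the "delta > (1ULL << 59)" warning in the quoted code.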

>
> > + RB_TYPE_TIME_EXTENT, /* Extend the time delta
> > + * array[0] = time delta (28 .. 59)
> > + * size = 8 bytes
> > + */
>
> please use standard comment style:
>
> /*
> * Comment
> */

Hmm, this is interesting. I kind of like this because it is not really a
standard comment. It is a comment about the definitions of the enum. I
believe if they are above:

/*
* Comment
*/
RB_ENUM_TYPE,

It is not as readable. But if we do:

RB_ENUM_TYPE, /*
* Comment
*/

The comment is not on the same line as the enum, which also looks
unpleasing.

We can't do:

/*
RB_ENUM_TYPE, * Comment
*/
/*
RB_ENUM_TYPE2, * Comment
*/

Because the ENUM is also in the comment :-p

I chose this way because we have:

RB_ENUM_TYPE, /* Comment
* More comment
*/
RB_ENUM_TYPE2, /* Comment
*/

I find this the nicest way to describe enums. That last */ is
good for spacing the comments apart; otherwise we have:

RB_ENUM_TYPE, /* Comment
* More comment */
RB_ENUM_TYPE2, /* Comment */

That makes it harder to see where the description of one enum ends and
the next begins.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
