Re: [PATCH 6/6] ftrace: take advantage of variable length entries

From: Steven Rostedt
Date: Tue Sep 30 2008 - 13:45:39 EST



On Tue, 30 Sep 2008, Pekka Paalanen wrote:

> Hi Steven,
>
> This is great news. I have some questions below.

Thanks!

> On Mon, 29 Sep 2008 23:02:42 -0400
> Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> > Now that the underlying ring buffer for ftrace holds variable length
> > entries, we can take advantage of this by only storing the size of the
> > actual event into the buffer. This happens to increase the number of
> > entries in the buffer dramatically.
>
> I've read this paragraph over and over, and I can't really understand
> "only storing the size of the ... event". Maybe you are trying to say
> that now we are not consuming more ring space than the actual size of
> the event?
>
> Oh! I misunderstood also the last sentence, it's not generating extra
> entries but there is room for more entries than before.

Yeah, it means that we can now store more entries in the same space.

> >
> > @@ -71,10 +89,34 @@ struct stack_entry {
> > * ftrace_printk entry:
> > */
> > struct print_entry {
> > + struct trace_entry ent;
> > unsigned long ip;
> > char buf[];
> > };
> >
> > +#define TRACE_OLD_SIZE 88
>
> I'm just wondering, does this hold for both 32- and 64-bit kernels?
> Long type is quite common in the structs. I realize this is going
> away, so maybe it doesn't even matter.

I believe 88 was what I came up with on x86_64. It was just a size picked
out of the blue and has no real meaning. I'm not sure anyone uses it, but
I wanted to make sure before hacking it away.

> > -/*
> > - * The trace field - the most basic unit of tracing. This is what
> > - * is printed in the end as a single line in the trace output, such as:
> > - *
> > - * bash-15816 [01] 235.197585: idle_cpu <- irq_enter
> > - */
> > -struct trace_field {
> > - char cpu;
> > - char flags;
> > - char preempt_count;
> > - int pid;
> > - union {
> > - struct ftrace_entry fn;
> > - struct ctx_switch_entry ctx;
> > - struct special_entry special;
> > - struct stack_entry stack;
> > - struct print_entry print;
> > - struct mmiotrace_rw mmiorw;
> > - struct mmiotrace_map mmiomap;
> > - struct boot_trace initcall;
> > - };
> > -};
>
> Finally we get rid of the union :-)

Yeah, this is my favorite part of the patch.

> > ret = trace_seq_printf(s,
> > "R %d %lu.%06lu %d 0x%llx 0x%lx 0x%lx %d\n",
> > @@ -220,14 +222,14 @@ static int mmio_print_rw(struct trace_it
> > static int mmio_print_map(struct trace_iterator *iter)
> > {
> > struct trace_entry *entry = iter->ent;
> > - struct mmiotrace_map *m = &entry->field.mmiomap;
> > + struct mmiotrace_map *m = (struct mmiotrace_map *)entry;
>
> This is different style than above, missing the struct
> trace_mmiotrace_map intermediate step. Looks like a bug,
> since struct mmiotrace_map is not the first field in
> struct trace_mmiotrace_map.

Crap! Yes, this is a bug. Thanks for pointing this out.

> > @@ -252,7 +254,8 @@ static int mmio_print_map(struct trace_i
> > static int mmio_print_mark(struct trace_iterator *iter)
> > {
> > struct trace_entry *entry = iter->ent;
> > - const char *msg = entry->field.print.buf;
> > + struct print_entry *print = (struct print_entry *)entry;
> > + const char *msg = print->buf;
> > struct trace_seq *s = &iter->seq;
> > unsigned long long t = ns2usecs(iter->ts);
> > unsigned long usec_rem = do_div(t, 1000000ULL);
> > @@ -264,7 +267,7 @@ static int mmio_print_mark(struct trace_
> > if (!ret)
> > return 0;
> >
> > - if (entry->field.flags & TRACE_FLAG_CONT)
> > + if (entry->flags & TRACE_FLAG_CONT)
> > trace_seq_print_cont(s, iter);
>
> If we have variable length entries, doesn't that mean that
> print_entry::buf will always hold the full nil-terminated message,
> and that we could simply remove all the trace_seq_print_cont()
> calls?
>
> In fact, TRACE_FLAG_CONT will never be set. I didn't dig up the
> other patches to verify, but trace_vprintk() isn't setting it
> anymore.

This patch brings us to the stage where we can get rid of the CONT flag.
I kept it in just to not break more than I need to ;-)

> > @@ -336,7 +339,7 @@ static void __trace_mmiotrace_map(struct
> > struct mmiotrace_map *map)
> > {
> > struct ring_buffer_event *event;
> > - struct trace_entry *entry;
> > + struct trace_mmiotrace_map *entry;
> > unsigned long irq_flags;
> >
> > event = ring_buffer_lock_reserve(tr->buffer, sizeof(*entry),
> > @@ -344,9 +347,9 @@ static void __trace_mmiotrace_map(struct
> > if (!event)
> > return;
> > entry = ring_buffer_event_data(event);
> > - tracing_generic_entry_update(entry, 0);
> > - entry->type = TRACE_MMIO_MAP;
> > - entry->field.mmiomap = *map;
> > + tracing_generic_entry_update(&entry->ent, 0);
> > + entry->ent.type = TRACE_MMIO_MAP;
> > + entry->map = *map;
> > ring_buffer_unlock_commit(tr->buffer, event, irq_flags);
> >
> > trace_wake_up();
>
> On the mmiotrace part, I don't see anything wrong here apart
> from that one bug.

Thanks, I'll add a patch to my upcoming queue to handle this.

-- Steve

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/