Re: Unified tracing buffer

From: Mathieu Desnoyers
Date: Mon Sep 22 2008 - 23:27:23 EST


* Steven Rostedt (rostedt@xxxxxxxxxxx) wrote:
>
> On Mon, 22 Sep 2008, Mathieu Desnoyers wrote:
> > * Steven Rostedt (rostedt@xxxxxxxxxxx) wrote:
[...]
> > > and then to turn on function tracing, I need to hook into this marker. I'd
> > > rather just push the data right into the buffer here without having to
> > > make another function call to hook into this.
> > >
> >
> > The scheme you propose here is based on a few inherent assumptions :
> >
> > - You assume ring_buffer_reserve() and ring_buffer_commit() are static
> > inline and thus does not turn into function calls.
> > - You assume these are small enough so they can be inlined without
> > causing L1 insn cache trashing when tracing is activated.
> > - You therefore assume they use a locking scheme that lets them be
> > really really compact (e.g. interrupt disable and spin lock).
> > - You assume that the performance impact of doing a function call is
> > bigger than the impact of locking, which is false by at least a factor
> > 10.
>
> I don't assume anything. I will have the requirement that reserve and
> commit must be paired, and for the first version, hold locks.
>

By saying you don't want to do any function call, the only technical
reason I see for you wanting that is performance, and thus you would
assume the above. If not, why don't you want to make another function
call ? This all I mean by "assumption" here.

> Maybe I should rename it to: ring_buffer_lock_reserve and
> ring_buffer_unlock_commit. To show this.
>
> [...]
>
> > kernel/trace/ftrace.c :
> >
> > /*
> > * the following macro would only do the "declaration" part of the
> > * markers, without doing all the function call stuff.
> > */
> > DECLARE_MARKER(function_entry,
> > "pid %d pc %d flags %lu func 0x%lX parent 0x%lX");
> >
> > void ftrace_mcount(unsigned long ip, unsigned long parent_ip)
> > {
> > size_t ev_size = 0;
> > char *buffer;
> >
> > /*
> > * We assume event payload aligned on sizeof(void *).
> > * Event size calculated statically.
> > */
> > ev_size += sizeof(int);
> > ev_size += var_align(ev_size, sizeof(int));
> > ev_size += sizeof(int);
> > ev_size += var_align(ev_size, sizeof(unsigned long));
> > ev_size += sizeof(unsigned long);
> > ev_size += var_align(ev_size, sizeof(unsigned long));
> > ev_size += sizeof(unsigned long);
> > ev_size += var_align(ev_size, sizeof(unsigned long));
> > ev_size += sizeof(unsigned long);
> >
> > /*
> > * Now reserve space and copy data.
> > */
> > buffer = ring_buffer_reserve(func_event_id, ev_size);
> > /* Write pid */
> > *(int *)buffer = current->pid;
> > buffer += sizeof(int);
> >
> > /* Write pc */
> > buffer += var_align(buffer, sizeof(int));
> > *(int *)buffer = preempt_count();
> > buffer += sizeof(int);
> >
> > /* Write flags */
> > buffer += var_align(buffer, sizeof(unsigned long));
> > *(unsigned long *)buffer = local_irq_flags();
> > buffer += sizeof(unsigned long);
> >
> > /* Write func */
> > buffer += var_align(buffer, sizeof(unsigned long));
> > *(unsigned long *)buffer = func;
> > buffer += sizeof(unsigned long);
> >
> > /* Write parent */
> > buffer += var_align(buffer, sizeof(unsigned long));
> > *(unsigned long *)buffer = parent;
> > buffer += sizeof(unsigned long);
> >
> > ring_buffer_commit(buffer, ev_size);
> > }
> >
> >
> > Would that be suitable for you ?
>
> YUCK YUCK YUCK!!!!
>
> Mathieu,
>
> Do I have to bring up the argument of simplicity again? I will never use
> such an API. Mine was very simple, I have to spend 10 minutes trying to
> figure out what the above is. I only spent 5 so I'm still at a lost.
>

I was actually waiting for you to propose an alternative, but I fear you
already did without me noticing :)

How do you deal with exporting data across kernel/user boundary in your
proposal exactly ? How does this work on architecture with 64-bits
kernel and 32-bits userland... ? A simple C structure copy might be
simple to _code_, but hellish to export to userspace and lead to hard to
debug binary incompatibilities (different gcc flags, 32/64 bits
user/kernel). And this is without telling about the non-portability of
the exported data.

If gcc/icc-knowledgeful people can reassure me by certifying it won't
generate a mess, fine, but until then, I stay very doubtful about
solutions involving to imply binary compability between kernel and
userland.

And common.. 10 minutes to understand the above code. Your _are_ kidding
me right ? Would that help if I create a small 4 lineish wrapper around
the buffer write ?

Mathieu

--
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/