Re: [RFC PATCH 1/2] Marker probes in futex.c

From: Peter Zijlstra
Date: Wed Apr 16 2008 - 11:51:38 EST


On Tue, 2008-04-15 at 09:25 -0400, Mathieu Desnoyers wrote:
> * Peter Zijlstra (a.p.zijlstra@xxxxxxxxx) wrote:
> > On Tue, 2008-04-15 at 08:32 -0400, Mathieu Desnoyers wrote:
> > > * Peter Zijlstra (a.p.zijlstra@xxxxxxxxx) wrote:
> > > > On Tue, 2008-04-15 at 17:23 +0530, K. Prasad wrote:
> > > >
> > > > > + trace_mark(futex_wait_called, "uaddr:%p fshared:%p val:%u "
> > > > > + "abs_time:%p bitset:%d",
> > > > > + uaddr, fshared, val, abs_time, bitset);
> > > >
> > > > This is some seriuosly ugly looking gunk, why would we want stuff like
> > > > that scattered across the code?
> > > >
> > >
> > > I don't really see how it differs so much from printks, which kernel
> > > developers are already familiar with.
> >
> > Which never last longer than the debug session and are never exposed to
> > enterprise kABI crap.
> >
> > > > What is wrong with a few simple hooks like:
> > > >
> > > > trace_futex_wait(uaddr, fshares, val, abs_time, bitset);
> > > >
> > > > and then deal with that.
> > > >
> > >
> > > If any of your variable type changes, then you are exporting an unknown
> > > data structure to user-space. _That_ would break a userspace tracer
> > > whenever you change any of these kernel variables and you don't want
> > > that.
> >
> > trace_futex_wait()'s signature would make the compiler issue a complaint
> > when the arguments suddenly changes type, no?
> >
>
> Yes, but then you would have to create new code for each event you want
> to trace. In the end, it would increase the icache footprint
> considerably and would also make addition of new events cumbersome.

Looking at patch 2/2 in this series that seems to be needed anyway. So
I'm not sure what adding all these character strings buy you.

> > > Exporting the field names and variable types helps to identify the
> > > variables by their given names rather than their respective order.
> > > Having the field type insures binary compatibility.
> > >
> > > Clearly we can turn your trace_futex_wait(uaddr, fshares, val, abs_time,
> > > bitset); into a trace_mark() with a simple define, and I don't see any
> > > problem with that. I just want to make sure the event name, field names
> > > and field types are exported, and this is done by markers. However, I
> > > wonder why none of the kernel printk() are turned into specialized
> > > defines to make the code "cleaner".. maybe it's because it is useful to
> > > have everything declared in one spot after all.
> >
> > I must be missing something here, printks don't need to look pretty
> > because they never see the light of lkml. They get ripped out as soon as
> > I understand what the heck happened.
> >
>
> The thing is that the trace_marks really fills two purpose : they
> extract information from the core kernel, which is meant to be shipped
> on production systems so tracing tools can report what is happening on
> the system and they also allow kernel hackers to add markers of their
> own, so they can extract information about specific events they are
> interested in along with the standard kernel instrumentation.
>
> So, part of it is meant to be standard kernel information, part of it
> can be used for debugging. And since the kernel code evolves through
> time, it makes sense to have an infrastructure flexible enough to follow
> these changes easily.

The trouble I have with these free form thingies is that its too hard to
keep them consistent.

So we have this horrible printf syntax; but even worse:


> ï+ trace_mark(futex_wake_called, "uaddr:%p fshared:%p nr_wake:%d "
> + "bitset:%d",
> + uaddr, fshared, nr_wake, bitset);


> + INIT_FUTEX_DEBUG_PROBE(futex_wake_called,
> + "uaddr:%p fshared:%p nr_wake:%d bitset:%d"),


Why the need to duplicate it; that's utter madness.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/