Re: [PATCH] tracing/kmemtrace: normalize the raw tracer event tothe unified tracing API

From: Ingo Molnar
Date: Tue Dec 30 2008 - 03:16:23 EST



* Pekka Enberg <penberg@xxxxxxxxxxxxxx> wrote:

> Hi Frederic,
>
> On Mon, 2008-12-29 at 23:09 +0100, Frederic Weisbecker wrote:
> > Pekka, note that I would be pleased to add statistical tracing on
> > this tracer, but I would need a hashtable, or an array, or a list, or whatever
> > iterable to insert the data into the stat tracing api.
> >
> > But I don't know your projects about this... whether you wanted to use a section
> > or something else...
>
> It really depends on what we're tracing. If we're interested in just the
> allocation hotspots, a section will do just fine. However, if we're
> tracing memory footprint, we need to take into store the object pointer
> returned from kmalloc() and kmem_cache_alloc() so we can update
> call-site statistics properly upon kfree().
>
> So I suppose we need both, a section for per call-site statistics and a
> hash table for the object -> call-site mapping.

1)

i think the call_site based tracking should be a built-in capability - the
branch tracer needs that too for example. That would also make it very
simple on the usage place: you wouldnt have to worry about sections in
slub.c/etc.

2)

i think a possibly useful intermediate object would be the slab cache
itself, which could be the basis for some highlevel stats too. It would
probably overlap /proc/slabinfo statistics but it's a natural part of this
abstraction i think.

3)

the most lowlevel (and hence most allocation-footprint sensitive) object
to track would be the memory object itself. I think the best approach
would be to do a static, limited size hash that could track up to N memory
objects.

The advantage of such an approach is that it does not impact allocation
patterns at all (besides the one-time allocation cost of the hash itself
during tracer startup).

The disadvantage is when an overflow happens: the sizing heuristics would
get the size correct most of the time anyway, so it's not a practical
issue. There would be some sort of sizing control similar to
/debug/tracing/buffer_size_kb, and a special trace entry that signals an
'overflow' of the hash table. (in that case we wont track certain objects
- but it would be clear from the trace output what happens and the hash
size can be adjusted.)

Another advantage would be that it would trivially not interact with any
allocator - because the hash itself would never 'allocate' in any dynamic
way. Either there are free entries available (in which case we use it), or
not - in which case we emit an hash-overflow trace entry.

And this too would be driven from ftrace mainly - the SLAB code would only
offer the alloc+free callbacks with the object IDs. [ and this means that
we could detect memory leaks by looking at the hash table and print out
the age of entries :-) ]

How does this sound to you?

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/