Re: [PATCH 0/3] ARM Coresight: Enhance ETM tracing control

From: Greg Kroah-Hartman
Date: Thu Dec 05 2013 - 15:17:05 EST


On Thu, Dec 05, 2013 at 03:12:50PM -0500, Christopher Covington wrote:
> Hi Greg,
>
> On 12/04/2013 11:01 PM, Greg Kroah-Hartman wrote:
> > On Wed, Dec 04, 2013 at 10:49:25PM -0500, Adrien Vergé wrote:
> >> 2013/12/4 Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>:
> >>> How much overhead does the existing tracing code have on ARM? Is ETM
> >>> still even needed? Why not just use ETM for the core tracing code
> >>> instead?
>
> I think support for the Embedded Trace Macrocell is desirable. (Maybe it's not
> necessarily *needed*, but in the same way that graphics and audio aren't
> necessarily needed when using a desktop machine.) Plugging the ETM into the
> core tracing code or maybe into the perf events framework would be
> interesting, but do these patches make that work any more difficult?

Well, these patches were incorrect, so that's not really a valid
question :)

And adding new features to code that is "dead" and should probably be
removed isn't a good idea, as I'm sure you can understand.

> >> Coresight ETM is not just faster than /sys/kernel/debug/tracing, it
> >> provides more detailed and customisable info. For instance, you can
> >> trace every load, store, instruction fetch, along with the number of
> >> cycles taken, with almost zero overhead.
> >
> > Can't you already do that with the 'perf' tool the kernel provides
> > without the ETM driver?
>
> With perf one can get a count of how many instructions have been executed,
> with little overhead, but not the full list of opcodes and addresses.

Is that a limitation of perf on ARM or perf in general? For some reason
I thought I had seen this using perf on x86, but it's been a while since
I last used it.

> (One can also sample the Program Counter intermittently, which might
> suffice for performance analysis, but probably doesn't for most
> debugging use cases.) I think with perf one can have a handful of
> watchpoints looking at a very few loads and stores, with large
> overhead. As I understand it, ETM can handle arbitrarily large
> regions, with little overhead.

How much work is it to incorporate ETM into the perf framework? Don't
you think that this is a better thing to do overall, instead of having
duplicate interfaces for the same thing?

thanks,

greg k-h