Re: [PATCH 0/7] perf, x86: Haswell LBR call stack support
From: Peter Zijlstra
Date: Wed Jun 26 2013 - 11:30:04 EST
On Tue, Jun 25, 2013 at 04:47:12PM +0800, Yan, Zheng wrote:
> From: "Yan, Zheng" <zheng.z.yan@xxxxxxxxx>
>
> Haswell has a new feature that utilizes the existing Last Branch Record
> facility to record call chains. When the feature is enabled, function
> calls will be collected as normal, but as return instructions are executed
> the last captured branch record is popped from the on-chip LBR registers.
> The LBR call stack facility can help perf to get call chains of programs
> without frame pointers. When perf tool requests PERF_SAMPLE_CALLCHAIN +
> PERF_SAMPLE_BRANCH_USER, this feature is dynamically enabled by default.
> This feature can be disabled/enabled through an attribute file in the cpu
> pmu sysfs directory.
>
> The LBR call stack has the following known limitations:
> 1. Zero length calls are not filtered out by hardware
> 2. Exception handling such as setjmp/longjmp will leave calls/returns
> unmatched
> 3. Pushing a different return address onto the stack will leave
> calls/returns unmatched
>
You fail to mention what happens when the callstack is deeper than the
LBR is big -- a rather common issue I'd think.
From what I gather if you push when full, the TOS rotates and eats the
tail allowing you to add another entry to the head.
If you pop when empty; nothing happens.
So on pretty much every program you'd be lucky to get the top of the
callstack but can end up with nearly nothing.
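A rough model of that behaviour, as a minimal Python sketch. It assumes a
16-entry LBR and the push/pop semantics described above; the class and
function names are made up for illustration and are not a kernel or
hardware interface.

```python
from collections import deque

LBR_DEPTH = 16  # assumption: number of LBR entries available


class LbrCallStack:
    """Toy model of the LBR in call-stack mode (hypothetical names)."""

    def __init__(self, depth=LBR_DEPTH):
        # A deque with maxlen silently drops the oldest entry when a
        # push overflows, i.e. the head "eats the tail".
        self.entries = deque(maxlen=depth)

    def call(self, func):
        # A call pushes a new record at the top of stack.
        self.entries.append(func)

    def ret(self):
        # A return pops the newest record; popping when empty does nothing.
        if self.entries:
            self.entries.pop()

    def snapshot(self):
        # Oldest surviving record first.
        return list(self.entries)


def simulate(call_depth, returns, depth=LBR_DEPTH):
    """Nest call_depth calls, then execute the given number of returns."""
    lbr = LbrCallStack(depth)
    for i in range(call_depth):
        lbr.call(f"f{i}")
    for _ in range(returns):
        lbr.ret()
    return lbr.snapshot()
```

With 20 nested calls the model keeps only f4..f19. After 16 returns from
there it is empty, even though the real stack still holds f0..f3.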
Given that, and the other limitations, I don't think it's a fair
replacement for user callchains.