Re: [PATCH/RFC v3] perf core: Allow setting up max frame stack depth via sysctl

From: Brendan Gregg
Date: Tue Apr 26 2016 - 17:59:18 EST


On Tue, Apr 26, 2016 at 2:05 PM, Arnaldo Carvalho de Melo
<arnaldo.melo@xxxxxxxxx> wrote:
> Em Tue, Apr 26, 2016 at 01:02:34PM -0700, Brendan Gregg escreveu:
>> On Mon, Apr 25, 2016 at 5:49 PM, Brendan Gregg <brendan.d.gregg@xxxxxxxxx> wrote:
>> > On Mon, Apr 25, 2016 at 5:47 PM, Arnaldo Carvalho de Melo <arnaldo.melo@xxxxxxxxx> wrote:
>> >> Em Mon, Apr 25, 2016 at 05:44:00PM -0700, Alexei Starovoitov escreveu:
>> >>> yep :)
>> >>> hopefully Brendan can give it another spin.
>> >>
>> >> Agreed, and I'm calling it a day anyway, Brendan, please consider
>> >> retesting, thanks,
>> >
>> > Will do, thanks!
>>
>> Looks good.
>>
>> I started with max depth = 512, and even that was still truncated, and
>> had to profile again at 1024 to capture the full stacks. Seems to
>> generally match the flame graph I generated with V1, which made me
>> want to check that I'm running the new patch, and am:
>>
>> # grep six_hundred_forty_kb /proc/kallsyms
>> ffffffff81c431e0 d six_hundred_forty_kb
>>
>> I was mucking around and was able to get "corrupted callchain.
>> skipping..." errors, but these look to be expected -- that was
>
> Yeah, thanks for testing!
>
> And since you talked about userspace without frame pointers, have you
> played with '--call-graph lbr'?

Not really. Isn't it only 16 levels deep max?

Most of our Linux systems are Xen guests (EC2), and I'd have to check
whether the MSRs for LBR are available. perf record --call-graph lbr ...
returns "The sys_perf_event_open() syscall returned with 95 (Operation
not supported) for event (cpu-clock).", so I'd guess not, although many
other MSRs are exposed.
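
For anyone wanting to repeat that check on their own guests, a rough
sketch is below. It only looks at whether perf record succeeds with
--call-graph lbr on a trivial command. probe_lbr is a made-up helper
name, and a failure can have causes other than missing LBR MSRs
(permissions, for example), so treat a negative result as a hint only.

```shell
#!/bin/sh
# Sketch: probe whether "perf record --call-graph lbr" works on this host.
# probe_lbr is a hypothetical helper name, not something perf provides.
probe_lbr() {
    # Bail out early if perf is not on PATH.
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not installed"
        return 2
    fi

    # Write the perf data to a throwaway file.
    tmp=$(mktemp) || { echo "could not create temp file"; return 3; }

    # Record a trivial command. A non-zero exit is taken to mean
    # "not supported", which is an assumption: other failures
    # (e.g. perf_event_paranoid restrictions) look the same here.
    if perf record --call-graph lbr -o "$tmp" -- true >/dev/null 2>&1; then
        rc=0
        echo "lbr call graphs supported"
    else
        rc=1
        echo "lbr call graphs not supported"
    fi

    rm -f "$tmp" "$tmp.old"
    return $rc
}

probe_lbr || true
```

Dropping the output redirection on the perf record line shows the
underlying error message, such as the "Operation not supported" one
quoted above.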

BTS seemed more promising (deeper stacks), and there's already Xen
support for it (need to boot the Xen host with vpmu=bts, preferably
vpmu=bts,arch for some PMCs as well :).

Brendan