Re: Stack trace of csum_partial_copy_generic

From: Josh Poimboeuf
Date: Mon May 16 2016 - 14:28:11 EST


Hi Nikolay,

On Fri, May 13, 2016 at 02:07:47PM +0300, Nikolay Borisov wrote:
> Hello Josh,
>
> I'd like to ask you whether objtool is supposed to produce a
> warning for arch/x86/lib/csum-copy_64.o (built from
> arch/x86/lib/csum-copy_64.S), since I cannot see any use of
> rbp to set up a stack frame there. I'm chasing the poor
> performance of a network benchmark, and this is what perf produces:
>
> # Overhead  Command  Shared Object      Symbol
> # ........  .......  .................  .........................
> #
>     37.30%  iperf    [kernel.kallsyms]  [k] csum_partial_copy_generic
>             |
>             --- csum_partial_copy_generic
>                |
>                |--99.98%-- 0x7f809108b7cd
>                |          |
>                |          |--69.72%-- 0x20000
>                |          |
>                |           --30.28%-- 0x7f809108b7c2
>                |                     0x20000
>                 --0.02%-- [...]
>
> So this is not very helpful for tracing where this is being
> called from, presumably somewhere in the networking layer.
> Should objtool catch this, or, since csum_partial_copy_generic is a
> leaf function, is a reliable stack trace not needed?

Right, since it's a leaf function, objtool ignores it and lets it do
whatever it wants with the frame pointer.

> Furthermore, this function is called from C wrappers in
> csum-wrappers_64.c - shouldn't they at least be present in the
> callstack?

I suspect the problem is that perf can't walk the stack because the
function overwrites the rbp register, so the frame pointer chain is
broken while it runs. Try replacing all uses of rbp in that function
with another register. r15?
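
To illustrate the failure mode, here is a toy userspace sketch in C. It
is not kernel or perf code: the array-based stack, the helper names
(walk_frames, unwind_demo) and all addresses are made up for
illustration. It only shows why a frame pointer walk finds nothing once
rbp holds data instead of a frame address.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Toy stack: "addresses" are indices into this array, not real pointers. */
static uint64_t stack[STACK_WORDS];

/*
 * Follow the saved-rbp chain starting at fp, the way a frame pointer
 * unwinder does: stack[fp] is the caller's saved rbp, stack[fp + 1] is
 * the return address. Stops at a zero or out-of-range frame pointer.
 */
static size_t walk_frames(uint64_t fp, uint64_t *ret_addrs, size_t max)
{
	size_t n = 0;

	while (n < max && fp != 0 && fp < STACK_WORDS - 1) {
		ret_addrs[n++] = stack[fp + 1];
		fp = stack[fp];
	}
	return n;
}

/*
 * Build two caller frames, then unwind with rbp either intact or
 * reused as a scratch register. All values are hypothetical.
 */
static size_t unwind_demo(int rbp_clobbered, uint64_t *ret_addrs, size_t max)
{
	uint64_t rbp;

	stack[12] = 0;       /* outermost frame: no older frame */
	stack[13] = 0xA000;  /* return address into the outer caller */
	stack[8]  = 12;      /* wrapper's frame links to the outer frame */
	stack[9]  = 0xB000;  /* return address into the wrapper's caller */

	rbp = rbp_clobbered ? 0x20000 : 8;  /* data value vs. valid frame */
	return walk_frames(rbp, ret_addrs, max);
}
```

With rbp intact, unwind_demo(0, ...) recovers both return addresses.
With rbp clobbered, unwind_demo(1, ...) recovers none, because the
walker has no valid frame to start from.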

(Another solution would be to tell perf to use DWARF unwinding instead
of frame pointers, but currently, kernel asm code doesn't have any DWARF
annotations. I'm planning on adding support for that in the 4.8
timeframe by generating DWARF metadata using objtool.)

--
Josh