Re: [PATCH] arch/arm64 : fix error in dump_backtrace
From: Dave P Martin
Date: Tue Nov 06 2018 - 06:32:59 EST
On Tue, Nov 06, 2018 at 11:00:19AM +0000, Mark Rutland wrote:
> On Tue, Nov 06, 2018 at 08:57:51AM +0000, Daniel Thompson wrote:
> > On Tue, Nov 06, 2018 at 08:39:01AM +0000, Mark Rutland wrote:
> > > On Tue, Nov 06, 2018 at 03:19:35PM +0800, Zhaoyang Huang wrote:
> > > > From: Zhaoyang Huang <zhaoyang.huang@xxxxxxxxxx>
> > > >
> > > > In some cases, the instruction of "bl foo1" will be the last one of the
> > > > foo2[1], which will cause the lr be the first instruction of the adjacent
> > > > foo3[2]. Hence, the backtrace will show the weird result as bellow[3].
> > > > The patch will fix it by miner 4 of the lr when dump_backtrace
> > >
> > > This has come up in the past (and a similar patch has been applied, then
> > > reverted).
> > >
> > > In general, we don't know that a function call was made via BL, and therefore
> > > cannot know that LR - 4 is the address of the caller. The caller could set up
> > > the LR as it likes, then B or BR to the callee, and depending on how the basic
> > > blocks get laid out in memory, LR - 4 might point at something completely
> > > different.
> > >
> > > More ideally, the compiler wouldn't end a function with a BL. When does that
> > > happen, and is there some way we could arrange for that to not happen? e.g.
> > > somehow pad a NOP after the BL.
> >
> > It's a consequence of having __noreturn isn't it? __noreturn frees the
> > compiler from the burden of having to produce a valid return stack... so
> > it doesn't and unwinding becomes hard.
>
> In that case, the compiler could equally just use B rather than BL,
> which this patch doesn't solve.
>
> The documentation for the GCC noreturn attribute [1] says:
>
> | In order to preserve backtraces, GCC will never turn calls to noreturn
> | functions into tail calls.
>
> ... so clearly it's not intended to mess up backtracing.
Which is a bit odd, since every call to a noreturn function is a tail-
call by definition, and other tail-calls are typically optimised to a B
(thus interfering with backtracing in all except the noreturn case).
Avoiding this would require a change to the compiler, and since there's
no obvious correct answer anyway, I guess we shouldn't hold our breath.
> IIUC we mostly use noreturn to prevent warings about uninitialised
> variables and such after a call to a noreturn function. I think
> optimization is a secondary concern.
Agreed.
> We could ask the GCC folk if they can ensure that a noreturn function
> call leave thes LR pointing into the caller, e.g. by padding with a NOP:
>
> BL<noreturn function>
> NOP
>
> That seems cheap enough, and would keep backtraces reliable.
-fpatchable-function-entry=1,1 does almost the right thing, by
inserting 1 NOP at the start of each function, and putting the function
entry point after that (1) NOP.
This works on gcc-8.1. I haven't experimented with earlier compilers.
It could be considered an abuse, since we'd not quite be using the
option for its intended purpose. It also causes every function
entry point to be misaligned due to the NOP insertion, which we may or
may not care about.
Cheers
---Dave
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.