Re: [PATCH v2] ARM: unwind: improve unwinders for noreturn case

From: Jiangfeng Xiao
Date: Wed Mar 20 2024 - 11:30:31 EST




On 2024/3/20 16:45, Russell King (Oracle) wrote:
> On Wed, Mar 20, 2024 at 11:44:38AM +0800, Jiangfeng Xiao wrote:
>> This is an off-by-one bug which is common in unwinders,
>> due to the fact that the address on the stack points
>> to the return address rather than the call address.
>>
>> So, for example, when the last instruction of a function
>> is a function call (e.g., to a noreturn function), it can
>> cause the unwinder to incorrectly try to unwind from
>> the function after the callee.
>>
>> foo:
>> ...
>> bl bar
>> ... end of function and thus next function ...
>>
>> which results in LR pointing into the next function.
>>
>> Fix this by subtracting 1 from frame->pc in the call frame
>> (but not exception frames) like ORC on x86 does.
>
> The reason that I'm not accepting this patch is because the above says
> that it fixes it by subtracting 1 from the PC value, but the patch is
> *way* more complicated than that and there's no explanation why.
>
> For example, the following are unexplained:
>
> - Why do we always need ex_frame

```
bar:
..
.. end of function bar ...

foo:
BUG();
.. end of function foo ...
```

For example, when the first instruction of function 'foo'
is an undefined instruction, executing 'foo' triggers an
exception, and 'arm_get_current_stackframe' assigns
'regs->ARM_pc' to 'frame.pc'.

If we always decrement frame.pc by 1, the unwinder will incorrectly
try to unwind from the function 'bar', which precedes 'foo'.

So we need 'ex_frame' to distinguish this case,
where we must not subtract 1.
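
The rule can be sketched as a small standalone C fragment. This is only an
illustration and not the patch itself: 'struct toy_frame', 'toy_lookup_pc',
'toy_symbol' and the addresses are made up for the example. Only the
'ex_frame' name and the "subtract 1 for call frames only" idea come from
this thread.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct stackframe. */
struct toy_frame {
	unsigned long pc;	/* return address, or faulting PC for an exception frame */
	bool ex_frame;		/* true when pc was taken from the exception registers */
};

/* Made-up layout: 'bar' ends exactly where 'foo' begins. */
#define TOY_BAR_START 0x1000UL
#define TOY_FOO_START 0x1010UL

/*
 * Address to use for the symbol / unwind-table lookup.
 * Call frame: pc is a return address, so step back by 1 to land
 * inside the calling instruction.
 * Exception frame: pc is the faulting instruction itself, use it as is.
 */
unsigned long toy_lookup_pc(const struct toy_frame *frame)
{
	return frame->ex_frame ? frame->pc : frame->pc - 1;
}

/* Toy symbol resolution for the two-function layout above. */
const char *toy_symbol(unsigned long addr)
{
	return addr >= TOY_FOO_START ? "foo" : "bar";
}
```

With pc equal to the first address of 'foo', an exception frame resolves to
'foo', while a call frame with the same pc resolves to 'bar', which is right
when 'bar' ended with a call to a noreturn function.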


> - What is the purpose of the change in format string for the display of
> backtraces
```
unwind_frame(&frame);
dump_backtrace_entry(...from) //from = frame.pc
printk("...%pS\n", ...(void *)from);
```
%pB will do sprint_backtrace and print the symbol at (from - 1) address
%pS will do sprint_symbol and print the symbol at (from) address

In unwind_frame, although we use 'frame->pc - 1' for the unwind_find_idx
lookup, frame->pc itself does not change. In the noreturn case, frame->pc
still points into the next function, so %pS is replaced by %pB.
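
To make the difference concrete, here is a userspace sketch of the two lookup
styles. It is not the kernel's vsprintf code: the symbol table, 'toy_find' and
'toy_sprint' are hypothetical, and the output format only imitates the usual
symbol+offset/size form. It assumes, as I understand sprint_backtrace, that
the lookup uses (address - 1) while the printed offset is still relative to
the original address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical symbol table entry. */
struct toy_sym {
	const char *name;
	unsigned long start;
	unsigned long size;
};

/* Made-up layout: 'foo' ends with "bl bar", 'next_func' follows directly. */
static const struct toy_sym toy_syms[] = {
	{ "foo",       0x2000UL, 0x20UL },
	{ "next_func", 0x2020UL, 0x40UL },
};

/* Return the symbol whose [start, start + size) range contains addr. */
const struct toy_sym *toy_find(unsigned long addr)
{
	for (size_t i = 0; i < sizeof(toy_syms) / sizeof(toy_syms[0]); i++) {
		if (addr >= toy_syms[i].start &&
		    addr - toy_syms[i].start < toy_syms[i].size)
			return &toy_syms[i];
	}
	return NULL;
}

/*
 * backtrace == false: look up addr itself (like %pS).
 * backtrace == true:  look up addr - 1 (like %pB), but keep the
 *                     offset relative to the original addr.
 */
int toy_sprint(char *buf, size_t len, unsigned long addr, bool backtrace)
{
	const struct toy_sym *sym = toy_find(backtrace ? addr - 1 : addr);

	if (!sym)
		return snprintf(buf, len, "0x%lx", addr);
	return snprintf(buf, len, "%s+0x%lx/0x%lx",
			sym->name, addr - sym->start, sym->size);
}
```

For the return address 0x2020 (the instruction after "bl bar"), the %pS-like
call yields "next_func+0x0/0x40" and the %pB-like call yields "foo+0x20/0x20".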

>
>>
>> Refer to the unwind_next_frame function in the unwind_orc.c
>>
>> Suggested-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
>> Link: https://lkml.kernel.org/lkml/20240305175846.qnyiru7uaa7itqba@treble/
>> Signed-off-by: Jiangfeng Xiao <xiaojiangfeng@xxxxxxxxxx>
>> ---
>> ChangeLog v1->v2
>> - stay printk("%s...", loglvl, ...)


Thank you for your suggestion.
I'll change the code to be more concise in my [patch v3].


>> --
>> 1.8.5.6
>>
>>
>