Re: [RFC PATCH v3 1/4] arm64: Introduce stack trace reliability checks in the unwinder

From: Madhavan T. Venkataraman
Date: Tue May 04 2021 - 20:21:40 EST




On 5/4/21 7:07 PM, Josh Poimboeuf wrote:
> On Tue, May 04, 2021 at 06:13:39PM -0500, Madhavan T. Venkataraman wrote:
>>
>>
>> On 5/4/21 4:52 PM, Josh Poimboeuf wrote:
>>> On Mon, May 03, 2021 at 12:36:12PM -0500, madvenka@xxxxxxxxxxxxxxxxxxx wrote:
>>>> @@ -44,6 +44,8 @@ int notrace unwind_frame(struct task_struct *tsk, struct stackframe *frame)
>>>> unsigned long fp = frame->fp;
>>>> struct stack_info info;
>>>>
>>>> + frame->reliable = true;
>>>> +
>>>
>>> Why set 'reliable' to true on every invocation of unwind_frame()?
>>> Shouldn't it be remembered across frames?
>>>
>>
>> This is mainly for debug purposes in case a caller wants to print the whole stack and also
>> print which functions are unreliable. For livepatch, it does not make any difference. It will
>> quit as soon as it encounters an unreliable frame.
>
> Hm, ok. So 'frame->reliable' refers to the current frame, not the
> entire stack.
>

Yes.

>>> Also, it looks like there are several error scenarios where it returns
>>> -EINVAL but doesn't set 'reliable' to false.
>>>
>>
>> I wanted to make a distinction between an error situation (like stack corruption where unwinding
>> has to stop) and an unreliable situation (where unwinding can still proceed). E.g., when a
>> stack trace is taken for informational purposes or debug purposes, the unwinding will try to
>> proceed until either the stack trace ends or an error happens.
>
> Ok, but I don't understand how that relates to my comment.
>
> Why wouldn't a stack corruption like !on_accessible_stack() set
> 'frame->reliable' to false?
>

I do see your point. If an error has been hit, then the stack trace is essentially unreliable
regardless of anything else. So, I accept your comment. I will mark the stack trace as unreliable
if any kind of error is encountered.

Thanks!

Madhavan