Re: [RFC PATCH v4 1/2] arm64: Introduce stack trace reliability checks in the unwinder

From: Madhavan T. Venkataraman
Date: Fri May 21 2021 - 13:47:18 EST




On 5/21/21 12:42 PM, Mark Brown wrote:
> On Fri, May 21, 2021 at 12:23:52PM -0500, Madhavan T. Venkataraman wrote:
>> On 5/21/21 11:11 AM, Mark Brown wrote:
>>> On Sat, May 15, 2021 at 11:00:17PM -0500, madvenka@xxxxxxxxxxxxxxxxxxx wrote:
>
>>>> + frame->reliable = true;
>
>>> All these checks are good checks but as you say there's more stuff that
>>> we need to add (like your patch 2 here) so I'm slightly nervous about
>
>> OK. So how about changing the field from a flag to an enum that says exactly
>> what happened with the frame?
>
> TBH I think the code is fine, or rather will be fine when it gets as far
> as actually being used - this was more a comment about when we flip this
> switch.
>

OK.

>> Also, the caller can get an exact idea of why the stack trace failed.
>
> I'm not sure anything other than someone debugging things will care
> enough to get the code out and then decode it so it seems like it'd be
> more trouble than it's worth, we're unlikely to be logging the code as
> standard.
>

OK.

>>> The other thing I guess is the question of if we want to bother flagging
>>> frames as unrelaible when we return an error; I don't see an issue with
>>> it and it may turn out to make it easier to do something in the future
>>> so I'm fine with that
>
>> Initially, I thought that there is no need to flag it for errors. But Josh
>> had a comment that the stack trace is indeed unreliable on errors. Again, the
>> word unreliable is the one causing the problem.
>
> My understanding there is that arch_stack_walk_reliable() should be
> returning an error if either the unwinder detected an error or if any
> frame in the stack is flagged as unreliable so from the point of view of
> users it's just looking at the error code, it's more that there's no
> need for arch_stack_walk_reliable() to consider the reliability
> information if an error has been detected and nothing else looks at the
> reliability information.
>
> Like I say we may come up with some use for the flag in error cases in
> future so I'm not opposed to keeping the accounting there.
>

So, should I leave it the way it is now? Or should I not set reliable = false
for errors? Which one do you prefer?

Josh,

Are you OK with not flagging reliable = false for errors in unwind_frame()?

Madhavan