Re: [RFC PATCH v4 1/2] arm64: Introduce stack trace reliability checks in the unwinder
From: Josh Poimboeuf
Date: Fri May 21 2021 - 15:16:31 EST
On Fri, May 21, 2021 at 02:11:45PM -0500, Josh Poimboeuf wrote:
> On Fri, May 21, 2021 at 01:59:16PM -0500, Madhavan T. Venkataraman wrote:
> >
> >
> > On 5/21/21 1:48 PM, Josh Poimboeuf wrote:
> > > On Fri, May 21, 2021 at 06:53:18PM +0100, Mark Brown wrote:
> > >> On Fri, May 21, 2021 at 12:47:13PM -0500, Madhavan T. Venkataraman wrote:
> > >>> On 5/21/21 12:42 PM, Mark Brown wrote:
> > >>
> > >>>> Like I say we may come up with some use for the flag in error cases in
> > >>>> future so I'm not opposed to keeping the accounting there.
> > >>
> > >>> So, should I leave it the way it is now? Or should I not set reliable = false
> > >>> for errors? Which one do you prefer?
> > >>
> > >>> Josh,
> > >>
> > >>> Are you OK with not flagging reliable = false for errors in unwind_frame()?
> > >>
> > >> I think it's fine to leave it as it is.
> > >
> > > Either way works for me, but if you remove those 'reliable = false'
> > > statements for stack corruption then, IIRC, the caller would still have
> > > some confusion between the end of stack error (-ENOENT) and the other
> > > errors (-EINVAL).
> > >
> >
> > I will leave it the way it is. That is, I will do reliable = false on errors
> > like you suggested.
> >
> > > So the caller would have to know that -ENOENT really means success.
> > > Which, to me, seems kind of flaky.
> > >
> >
> > Actually, that is why -ENOENT was introduced - to indicate successful
> > stack trace termination. A return value of 0 is for continuing with
> > the stack trace. A non-zero value is for terminating the stack trace.
> >
> > So, either we return a positive value (say 1) to indicate successful
> > termination. Or, we return -ENOENT to say no more stack frames left.
> > I guess -ENOENT was chosen.
>
> I see. So it's a tri-state return value, and frame->reliable is
> intended to be a private interface not checked by the callers.
Or is frame->reliable supposed to be checked after all? Looking at the
code again, I'm not sure.
Either way it would be good to document the interface more clearly in a
comment above the function.
--
Josh