Re: [PATCH v11 4/5] arm64: Introduce stack trace reliability checks in the unwinder

From: Madhavan T. Venkataraman
Date: Fri Nov 26 2021 - 12:25:51 EST

On 11/26/21 7:29 AM, Mark Brown wrote:
> On Thu, Nov 25, 2021 at 10:59:27AM -0600, Madhavan T. Venkataraman wrote:
>> On 11/25/21 8:56 AM, Mark Brown wrote:
>>> On Tue, Nov 23, 2021 at 01:37:22PM -0600, madvenka@xxxxxxxxxxxxxxxxxxx wrote:
>
>>> Probably also worth noting that this doesn't select
>>> HAVE_RELIABLE_STACKTRACE which is what any actual users are going to use
>>> to identify if the architecture has the feature. I would have been
>>> tempted to add arch_stack_walk() as a separate patch but equally having
>>> the user code there (even if it itself can't yet be used...) helps with
>>> reviewing the actual unwinder so I don't mind.
>
>> I did not select HAVE_RELIABLE_STACKTRACE just in case we think that some
>> more reliability checks need to be added. But if reviewers agree
>> that this patch series contains all the reliability checks we need, I
>> will add a patch to select HAVE_RELIABLE_STACKTRACE to the series.
>
> I agree that more checks probably need to be added, might be worth
> throwing that patch into the end of the series though to provide a place
> to discuss what exactly we need. My main thought here was that it's
> worth explicitly highlighting in this change that the Kconfig bit isn't
> glued up here so reviewers notice that's what's happening.
>

OK. I will add the patch to the next version.

>>>> +static void unwind_check_reliability(struct task_struct *task,
>>>> +				     struct stackframe *frame)
>>>> +{
>>>> +	if (frame->fp == (unsigned long)task_pt_regs(task)->stackframe) {
>>>> +		/* Final frame; no more unwind, no need to check reliability */
>>>> +		return;
>>>> +	}
>
>>> If the unwinder carries on for some reason (the code for that is
>>> elsewhere and may be updated separately...) then this will start
>>> checking again. I'm not sure if this is a *problem* as such but the
>>> thing about this being the final frame coupled with not actually
>>> explicitly stopping the unwind here makes me think this should at least
>>> be clearer, the comment begs the question about what happens if
>>> something decides it is not in fact the final frame.
>
>> I can address this by adding an explicit comment to that effect.
>> For example, define a separate function to check for the final frame:
>
>> /*
>>  * Check if this is the final frame. Unwind must stop at the final
>>  * frame.
>>  */
>> static inline bool unwind_is_final_frame(struct task_struct *task,
>> 					 struct stackframe *frame)
>> {
>> 	return frame->fp == (unsigned long)task_pt_regs(task)->stackframe;
>> }
>
>> Then, use this function in unwind_check_reliability() and unwind_continue().
>
>> Is this acceptable?
>
> Yes, I think that should address the issue - I'd have to see it in
> context to be sure but it does make it clear that the same check is
> being done which was the main thing.
>

OK. I will make the above changes in the next version.
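
To make the intent concrete, here is a rough userspace sketch of how the
shared check could be used from both places. It is only an illustration,
not the code that will be in the next version: the struct definitions are
simplified stand-ins for the kernel types, the final_fp field is a
hypothetical substitute for (unsigned long)task_pt_regs(task)->stackframe,
and the zero-PC test is a placeholder for the real reliability checks.

```c
#include <stdbool.h>

/* Simplified stand-ins for the kernel types (assumption, not the real layout). */
struct stackframe {
	unsigned long fp;
	unsigned long pc;
	bool reliable;
};

struct task_struct {
	/* Hypothetical field standing in for task_pt_regs(task)->stackframe. */
	unsigned long final_fp;
};

/*
 * Check if this is the final frame. Unwind must stop at the final
 * frame.
 */
static inline bool unwind_is_final_frame(struct task_struct *task,
					 struct stackframe *frame)
{
	return frame->fp == task->final_fp;
}

static void unwind_check_reliability(struct task_struct *task,
				     struct stackframe *frame)
{
	/* Final frame; the unwind stops here, so there is nothing to check. */
	if (unwind_is_final_frame(task, frame))
		return;

	/* Placeholder check (assumption): treat a zero PC as unreliable. */
	if (!frame->pc)
		frame->reliable = false;
}

/* Returns true if the unwinder should move on to the next frame. */
static bool unwind_continue(struct task_struct *task,
			    struct stackframe *frame)
{
	/* Same check as above, so both paths agree on where the unwind ends. */
	if (unwind_is_final_frame(task, frame))
		return false;

	return true;
}
```

The point is that unwind_check_reliability() and unwind_continue() call
the same helper, so a frame that skips the reliability checks is also the
frame at which the unwind terminates.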

Thanks.

Madhavan