Re: [RFC PATCH v7 0/4] arm64: Reorganize the unwinder and implement stack trace reliability checks

From: Madhavan T. Venkataraman
Date: Thu Aug 12 2021 - 14:45:28 EST


My mailer is screwing up.

I will resend the whole series as version 8 instead of version 7 to avoid further
confusion.

Thunderbird acts up sometimes! Again, I am so sorry.

Madhavan

On 8/12/21 1:31 PM, Madhavan T. Venkataraman wrote:
> The messages are not threaded properly.
>
> I will resend the whole series with proper threading.
>
> I apologize.
>
> Madhavan
>
> On 8/12/21 8:24 AM, madvenka@xxxxxxxxxxxxxxxxxxx wrote:
>> From: "Madhavan T. Venkataraman" <madvenka@xxxxxxxxxxxxxxxxxxx>
>>
>> Make all stack walking functions use arch_stack_walk()
>> ======================================================
>>
>> Currently, there are multiple functions in ARM64 code that walk the
>> stack using start_backtrace() and unwind_frame(). Convert all of
>> them to use arch_stack_walk(). This makes maintenance easier.
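>>
>> For example, a former open-coded walker reduces to a callback plus a
>> single call, roughly like this (a minimal sketch; print_entry() and
>> dump_backtrace_example() are illustrative names, not from the patches):
>>
>> #include <linux/printk.h>
>> #include <linux/sched.h>
>> #include <linux/stacktrace.h>
>>
>> /* Called once per frame; returning true continues the walk. */
>> static bool print_entry(void *cookie, unsigned long pc)
>> {
>> 	printk("%pS\n", (void *)pc);
>> 	return true;
>> }
>>
>> static void dump_backtrace_example(struct task_struct *tsk)
>> {
>> 	/* All frame-pointer handling now lives behind arch_stack_walk(). */
>> 	arch_stack_walk(print_entry, NULL, tsk, NULL);
>> }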
>>
>> Reorganize the unwinder code for better consistency and maintenance
>> ===================================================================
>>
>> Rename the unwinder functions to unwind_*(), similar to other
>> architectures, for naming consistency.
>>
>> Annotate all of the unwind_*() functions with notrace so they cannot be
>> ftraced, and with NOKPROBE_SYMBOL() so they cannot be kprobed. This is
>> needed because ftrace and kprobe code can call the unwinder.
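>>
>> For instance, each unwind_*() function is annotated along these lines
>> (a sketch of the annotation style only; the body is omitted and
>> NOKPROBE_SYMBOL() comes from <linux/kprobes.h>):
>>
>> static void notrace unwind_next(struct stackframe *frame)
>> {
>> 	/* ... advance fp/pc to the caller's frame ... */
>> }
>> NOKPROBE_SYMBOL(unwind_next);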
>>
>> Redefine the unwinder loop and make it similar to other architectures.
>> Define the following:
>>
>> 	unwind_start(&frame, task, fp, pc);
>> 	while (unwind_consume(&frame, consume_entry, cookie))
>> 		unwind_next(&frame);
>> 	return !unwind_failed(&frame);
>>
>> unwind_start()
>> Same as the original start_backtrace().
>>
>> unwind_consume()
>> This new function does two things:
>>
>> - Calls consume_entry() to consume the return PC.
>>
>> - Implements checks to determine whether the unwind should continue
>> or terminate.
>>
>> unwind_next()
>> Same as the original unwind_frame() except:
>>
>> - the stack trace termination check has been moved from here to
>> unwind_consume(). So, unwind_next() assumes that the fp is valid.
>>
>> - unwind_frame() used to return an error value. This function only
>> sets internal state and does not return anything. The state is
>> retrieved via a helper. See next.
>>
>> unwind_failed()
>> Return a boolean to indicate whether the stack trace completed
>> successfully or failed. arch_stack_walk() ignores the return
>> value. But arch_stack_walk_reliable() in the future will look
>> at the return value.
>>
>> Unwind status
>> Introduce a new flag called "failed" in struct stackframe. Set this
>> flag when an error is encountered. If this flag is set, terminate
>> the unwind. Also, let the unwinder return the status to the caller.
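>>
>> Putting those pieces together, the internal driver looks roughly like
>> this (a simplified sketch, not the exact patch code; the stackframe
>> fields shown are abbreviated):
>>
>> struct stackframe {
>> 	unsigned long fp;
>> 	unsigned long pc;
>> 	bool need_reliable;	/* caller wants a reliable trace (see below) */
>> 	bool failed;		/* set on any error; terminates the unwind */
>> 	/* ... other unwinder state (stack bounds, graph tracer index, ...) */
>> };
>>
>> static bool notrace unwind(struct task_struct *task, unsigned long fp,
>> 			   unsigned long pc,
>> 			   stack_trace_consume_fn consume_entry, void *cookie)
>> {
>> 	struct stackframe frame;
>>
>> 	unwind_start(&frame, task, fp, pc);
>> 	while (unwind_consume(&frame, consume_entry, cookie))
>> 		unwind_next(&frame);
>> 	return !unwind_failed(&frame);
>> }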
>>
>> Reliability checks
>> ==================
>>
>> There are some kernel features and conditions that make a stack trace
>> unreliable. Callers may require the unwinder to detect these cases.
>> E.g., livepatch.
>>
>> Introduce a new function called unwind_is_reliable() that will detect
>> these cases and return a boolean.
>>
>> Introduce a new argument to unwind() called "need_reliable" so a caller
>> can tell unwind() that it requires a reliable stack trace. For such a
>> caller, any unreliability in the stack trace must be treated as a fatal
>> error and the unwind must be aborted.
>>
>> Call unwind_is_reliable() from unwind_consume() like this:
>>
>> 	if (frame->need_reliable && !unwind_is_reliable(frame)) {
>> 		frame->failed = true;
>> 		return false;
>> 	}
>>
>> arch_stack_walk() passes "false" for need_reliable because its callers
>> don't care about reliability. arch_stack_walk() is used for debug and
>> test purposes.
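>>
>> In terms of the sketch above (assuming need_reliable is added as
>> unwind()'s fourth argument), arch_stack_walk() might look roughly like
>> this; the regs/current/task cases are an assumption about how the
>> starting fp/pc are picked:
>>
>> void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
>> 		     struct task_struct *task, struct pt_regs *regs)
>> {
>> 	unsigned long fp, pc;
>>
>> 	if (regs) {
>> 		fp = regs->regs[29];
>> 		pc = regs->pc;
>> 	} else if (task == current) {
>> 		fp = (unsigned long)__builtin_frame_address(1);
>> 		pc = (unsigned long)__builtin_return_address(0);
>> 	} else {
>> 		fp = thread_saved_fp(task);
>> 		pc = thread_saved_pc(task);
>> 	}
>>
>> 	/* Reliability is not required for debug/test callers. */
>> 	unwind(task, fp, pc, false, consume_entry, cookie);
>> }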
>>
>> Introduce arch_stack_walk_reliable() for ARM64. This works like
>> arch_stack_walk() except for two things:
>>
>> - It passes "true" for need_reliable.
>>
>> - It returns -EINVAL if unwind() aborts.
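>>
>> The reliable variant then differs only in the flag and the return value
>> (again a sketch under the same assumptions; shown only for a blocked
>> task for brevity):
>>
>> int arch_stack_walk_reliable(stack_trace_consume_fn consume_fn, void *cookie,
>> 			     struct task_struct *task)
>> {
>> 	/* Ask for a reliable trace; abort and report -EINVAL on failure. */
>> 	if (!unwind(task, thread_saved_fp(task), thread_saved_pc(task),
>> 		    true, consume_fn, cookie))
>> 		return -EINVAL;
>>
>> 	return 0;
>> }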
>>
>> Introduce the first reliability check in unwind_is_reliable(): if a
>> return PC is not a valid kernel text address, consider the stack trace
>> unreliable, as the PC could belong to some generated code.
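>>
>> A sketch of this first check (the real function accumulates more checks
>> over time):
>>
>> static bool notrace unwind_is_reliable(struct stackframe *frame)
>> {
>> 	/*
>> 	 * A return PC outside kernel text may belong to generated code
>> 	 * and cannot be unwound reliably.
>> 	 */
>> 	if (!__kernel_text_address(frame->pc))
>> 		return false;
>>
>> 	return true;
>> }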
>>
>> Other reliability checks will be added in the future. Until all of the
>> checks are in place, arch_stack_walk_reliable() may not be used by
>> livepatch. But it may be used by debug and test code.
>>
>> SYM_CODE check
>> ==============
>>
>> SYM_CODE functions do not follow normal calling conventions. They cannot
>> be unwound reliably using the frame pointer. Collect the address ranges
>> of these functions in a special section called "sym_code_functions".
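>>
>> One way to collect the ranges is to extend SYM_CODE_END() to emit a
>> (start, end) address pair into the new section. The following is a
>> sketch of the idea, not the exact macro from the patch:
>>
>> #define SYM_CODE_END(name)				\
>> 	SYM_END(name, SYM_T_NONE)			;\
>> 99:							;\
>> 	.pushsection "sym_code_functions", "aw"		;\
>> 	.quad	name					;\
>> 	.quad	99b					;\
>> 	.popsection
>>
>> The section is then gathered by vmlinux.lds.S (placed in rodata, per the
>> v7 change below) with begin/end symbols around it.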
>>
>> In unwind_is_reliable(), check the return PC against these ranges. If a
>> match is found, then consider the stack trace unreliable. This is the
>> second reliability check introduced by this work.
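>>
>> The check itself is then a linear scan over the collected pairs, along
>> these lines (a sketch; the helper name and the section bound symbols are
>> assumptions, not the names used in the patch):
>>
>> extern unsigned long __sym_code_functions_start[];
>> extern unsigned long __sym_code_functions_end[];
>>
>> static bool pc_is_sym_code(unsigned long pc)
>> {
>> 	unsigned long *p;
>>
>> 	/* The section holds (start, end) address pairs, one per function. */
>> 	for (p = __sym_code_functions_start;
>> 	     p < __sym_code_functions_end; p += 2) {
>> 		if (pc >= p[0] && pc < p[1])
>> 			return true;
>> 	}
>>
>> 	return false;
>> }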
>>
>> Last stack frame
>> ----------------
>>
>> If a SYM_CODE function occurs in the very last frame in the stack trace,
>> then the stack trace is not considered unreliable. This is because there
>> is no more unwinding to do. Examples:
>>
>> - EL0 exception stack traces end in the top level EL0 exception
>> handlers.
>>
>> - All kernel thread stack traces end in ret_from_fork().
>> ---
>> Changelog:
>>
>> v7:
>> From Mark Rutland:
>>
>> - Make the unwinder loop similar to other architectures.
>>
>> - Keep details to within the unwinder functions and return a simple
>> boolean to the caller.
>>
>> - Convert some of the current code that contains unwinder logic to
>> simply use arch_stack_walk(). I have converted all of them.
>>
>> - Do not copy sym_code_functions[]. Just place it in rodata for now.
>>
>> - Have the main loop check for termination conditions rather than
>> having unwind_frame() check for them. In other words, let
>> unwind_frame() assume that the fp is valid.
>>
>> - Replace the big comment for SYM_CODE functions with a shorter
>> comment.
>>
>> /*
>> * As SYM_CODE functions don't follow the usual calling
>> * conventions, we assume by default that any SYM_CODE function
>> * cannot be unwound reliably.
>> *
>> * Note that this includes:
>> *
>> * - Exception handlers and entry assembly
>> * - Trampoline assembly (e.g., ftrace, kprobes)
>> * - Hypervisor-related assembly
>> * - Hibernation-related assembly
>> * - CPU start-stop, suspend-resume assembly
>> * - Kernel relocation assembly
>> */
>>
>> v6:
>> From Mark Rutland:
>>
>> - The per-frame reliability concept and flag are acceptable. But more
>> work is needed to make the per-frame checks more accurate and more
>> complete. E.g., some code reorg is being worked on that will help.
>>
>> I have now removed the frame->reliable flag and deleted the whole
>> concept of per-frame status. This is orthogonal to this patch series.
>> Instead, I have improved the unwinder to return proper return codes
>> so a caller can take appropriate action without needing per-frame
>> status.
>>
>> - Remove the mention of PLTs and update the comment.
>>
>> I have replaced the comment above the call to __kernel_text_address()
>> with the comment suggested by Mark Rutland.
>>
>> Other comments:
>>
>> - Other comments on the per-frame stuff are not relevant because
>> that approach is not there anymore.
>>
>> v5:
>> From Keiya Nobuta:
>>
>> - The term blacklist(ed) is not to be used anymore. I have changed it
>> to unreliable. So, the function unwinder_blacklisted() has been
>> changed to unwinder_is_unreliable().
>>
>> From Mark Brown:
>>
>> - Add a comment for the "reliable" flag in struct stackframe. The
>> reliability attribute is not complete until all the checks are
>> in place. Added a comment above struct stackframe.
>>
>> - Include some of the comments in the cover letter in the actual
>> code so that we can compare it with the reliable stack trace
>> requirements document for completeness. I have added a comment:
>>
>> - above unwinder_is_unreliable() that lists the requirements
>> that are addressed by the function.
>>
>> - above the __kernel_text_address() call about all the cases
>> the call covers.
>>
>> v4:
>> From Mark Brown:
>>
>> - I was checking the return PC with __kernel_text_address() before
>> the Function Graph trace handling. Mark Brown felt that all the
>> reliability checks should be performed on the original return PC
>> once that is obtained. So, I have moved all the reliability checks
>> to after the Function Graph Trace handling code in the unwinder.
>> Basically, the unwinder should perform PC translations first (for the
>> return trampoline used by Function Graph Tracing, Kretprobes, etc.).
>> Then, the reliability checks should be applied to the resulting PC.
>>
>> - Mark said to improve the naming of the new functions so they don't
>> collide with existing ones. I have used a prefix "unwinder_" for
>> all the new functions.
>>
>> From Josh Poimboeuf:
>>
>> - In the error scenarios in the unwinder, the reliable flag in the
>> stack frame should be set. Implemented this.
>>
>> - Some of the other comments are not relevant to the new code as
>> I have taken a different approach in the new code. That is why
>> I have not made those changes. E.g., Ard wanted me to add the
>> "const" keyword to the global section array. That array does not
>> exist in v4. Similarly, Mark Brown said to use ARRAY_SIZE() for
>> the same array in a for loop.
>>
>> Other changes:
>>
>> - Add a new definition for SYM_CODE_END() that adds the address
>> range of the function to a special section called
>> "sym_code_functions".
>>
>> - Include the new section under initdata in vmlinux.lds.S.
>>
>> - Define an early_initcall() to copy the contents of the
>> "sym_code_functions" section to an array by the same name.
>>
>> - Define a function unwinder_blacklisted() that compares a return
>> PC against sym_code_sections[]. If there is a match, mark the
>> stack trace unreliable. Call this from unwind_frame().
>>
>> v3:
>> - Implemented a sym_code_ranges[] array to contain the section bounds of
>> the text sections that contain SYM_CODE_*() functions. The unwinder
>> checks each return PC against the sections. If it falls in any of
>> the sections, the stack trace is marked unreliable.
>>
>> - Moved SYM_CODE functions from .text and .init.text into a new
>> text section called ".code.text". Added this section to
>> vmlinux.lds.S and sym_code_ranges[].
>>
>> - Fixed the logic in the unwinder that handles Function Graph
>> Tracer return trampoline.
>>
>> - Removed all the previous code that handles:
>> - ftrace entry code for traced function
>> - special_functions[] array that lists individual functions
>> - kretprobe_trampoline() special case
>>
>> v2:
>> - Removed the terminating entry { 0, 0 } in special_functions[]
>> and replaced it with the idiom { /* sentinel */ }.
>>
>> - Change the ftrace trampoline entry ftrace_graph_call in
>> special_functions[] to ftrace_call + 4 and added explanatory
>> comments.
>>
>> - Unnested #ifdefs in special_functions[] for FTRACE.
>>
>> v1:
>> - Define a bool field in struct stackframe. This will indicate if
>> a stack trace is reliable.
>>
>> - Implement a special_functions[] array that will be populated
>> with special functions in which the stack trace is considered
>> unreliable.
>>
>> - Using kallsyms_lookup(), get the address ranges for the special
>> functions and record them.
>>
>> - Implement an is_reliable_function(pc). This function will check
>> if a given return PC falls in any of the special functions. If
>> it does, the stack trace is unreliable.
>>
>> - Implement check_reliability() function that will check if a
>> stack frame is reliable. Call is_reliable_function() from
>> check_reliability().
>>
>> - Before a return PC is checked against special_functions[], it
>> must be validated as a proper kernel text address. Call
>> __kernel_text_address() from check_reliability().
>>
>> - Finally, call check_reliability() from unwind_frame() for
>> each stack frame.
>>
>> - Add EL1 exception handlers to special_functions[].
>>
>> el1_sync();
>> el1_irq();
>> el1_error();
>> el1_sync_invalid();
>> el1_irq_invalid();
>> el1_fiq_invalid();
>> el1_error_invalid();
>>
>> - The above functions are currently defined as LOCAL symbols.
>> Make them global so that they can be referenced from the
>> unwinder code.
>>
>> - Add FTRACE trampolines to special_functions[]:
>>
>> ftrace_graph_call()
>> ftrace_graph_caller()
>> return_to_handler()
>>
>> - Add the kretprobe trampoline to special_functions[]:
>>
>> kretprobe_trampoline()
>>
>> Previous versions and discussion
>> ================================
>>
>> v6: https://lore.kernel.org/linux-arm-kernel/20210630223356.58714-1-madvenka@xxxxxxxxxxxxxxxxxxx/
>> v5: https://lore.kernel.org/linux-arm-kernel/20210526214917.20099-1-madvenka@xxxxxxxxxxxxxxxxxxx/
>> v4: https://lore.kernel.org/linux-arm-kernel/20210516040018.128105-1-madvenka@xxxxxxxxxxxxxxxxxxx/
>> v3: https://lore.kernel.org/linux-arm-kernel/20210503173615.21576-1-madvenka@xxxxxxxxxxxxxxxxxxx/
>> v2: https://lore.kernel.org/linux-arm-kernel/20210405204313.21346-1-madvenka@xxxxxxxxxxxxxxxxxxx/
>> v1: https://lore.kernel.org/linux-arm-kernel/20210330190955.13707-1-madvenka@xxxxxxxxxxxxxxxxxxx/
>>
>> Madhavan T. Venkataraman (4):
>> arm64: Make all stack walking functions use arch_stack_walk()
>> arm64: Reorganize the unwinder code for better consistency and
>> maintenance
>> arm64: Introduce stack trace reliability checks in the unwinder
>> arm64: Create a list of SYM_CODE functions, check return PC against
>> list
>>
>> arch/arm64/include/asm/linkage.h | 12 ++
>> arch/arm64/include/asm/sections.h | 1 +
>> arch/arm64/include/asm/stacktrace.h | 16 +-
>> arch/arm64/kernel/perf_callchain.c | 5 +-
>> arch/arm64/kernel/process.c | 39 ++--
>> arch/arm64/kernel/return_address.c | 6 +-
>> arch/arm64/kernel/stacktrace.c | 291 ++++++++++++++++++++--------
>> arch/arm64/kernel/time.c | 22 ++-
>> arch/arm64/kernel/vmlinux.lds.S | 10 +
>> 9 files changed, 277 insertions(+), 125 deletions(-)
>>
>>
>> base-commit: 36a21d51725af2ce0700c6ebcb6b9594aac658a6
>>