Re: [PATCH v4 01/15] stacktrace/x86: add function for detecting reliable stack traces

From: Josh Poimboeuf
Date: Thu Jan 26 2017 - 12:57:53 EST


On Thu, Jan 26, 2017 at 02:56:03PM +0100, Petr Mladek wrote:
> On Thu 2017-01-19 09:46:09, Josh Poimboeuf wrote:
> > For live patching and possibly other use cases, a stack trace is only
> > useful if it can be assured that it's completely reliable. Add a new
> > save_stack_trace_tsk_reliable() function to achieve that.
>
> > diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
> > index 0653788..fc36842 100644
> > --- a/arch/x86/kernel/stacktrace.c
> > +++ b/arch/x86/kernel/stacktrace.c
> > @@ -74,6 +74,90 @@ void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
> > }
> > EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
> >
> > +#ifdef CONFIG_HAVE_RELIABLE_STACKTRACE
> > +static int __save_stack_trace_reliable(struct stack_trace *trace,
> > + struct task_struct *task)
> > +{
> > + struct unwind_state state;
> > + struct pt_regs *regs;
> > + unsigned long addr;
> > +
> > + for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
> > + unwind_next_frame(&state)) {
> > +
> > + regs = unwind_get_entry_regs(&state);
> > + if (regs) {
> > + /*
> > + * Kernel mode registers on the stack indicate an
> > + * in-kernel interrupt or exception (e.g., preemption
> > + * or a page fault), which can make frame pointers
> > + * unreliable.
> > + */
> > + if (!user_mode(regs))
> > + return -1;
> > +
> > + /*
> > + * The last frame contains the user mode syscall
> > + * pt_regs. Skip it and finish the unwind.
> > + */
> > + unwind_next_frame(&state);
> > + if (WARN_ON_ONCE(!unwind_done(&state))) {
> > + show_stack(task, NULL);
>
> We should make sure that show_stack() is called only once as well.
> Otherwise, it would fill logbuffer with random stacktraces without
> any context. It might easily cause flood of messages and the first
> useful one might get lost in the ring buffer.

Agreed.

> > + return -1;
> > + }
> > + break;
> > + }
> > +
> > + addr = unwind_get_return_address(&state);
> > +
> > + /*
> > + * A NULL or invalid return address probably means there's some
> > + * generated code which __kernel_text_address() doesn't know
> > + * about.
> > + */
> > + if (WARN_ON_ONCE(!addr)) {
> > + show_stack(task, NULL);
>
> Same here.
>
> > + return -1;
> > + }
> > +
> > + if (save_stack_address(trace, addr, false))
> > + return -1;
> > + }
> > +
> > + /* Check for stack corruption */
> > + if (WARN_ON_ONCE(unwind_error(&state))) {
> > + show_stack(task, NULL);
>
> And here.
>
> > + return -1;
> > + }
> > +
> > + if (trace->nr_entries < trace->max_entries)
> > + trace->entries[trace->nr_entries++] = ULONG_MAX;
> > +
> > + return 0;
> > +}
> > +
> > +/*
> > + * This function returns an error if it detects any unreliable features of the
> > + * stack. Otherwise it guarantees that the stack trace is reliable.
> > + *
> > + * If the task is not 'current', the caller *must* ensure the task is inactive.
> > + */
> > +int save_stack_trace_tsk_reliable(struct task_struct *tsk,
> > + struct stack_trace *trace)
> > +{
> > + int ret;
> > +
> > + if (!try_get_task_stack(tsk))
> > + return -EINVAL;
> > +
> > + ret = __save_stack_trace_reliable(trace, tsk);
>
> __save_stack_trace_reliable() returns -1 in case of problems.
> But this function returns a meaningful error codes, line -EINVAL,
> -ENOSYS, otherwise.
>
> We should either transform the error code here to something
> "meaningful", probably -EINVAL. Or we should update
> __save_stack_trace_reliable() to return meaningful error codes.

Agreed.

> > + put_task_stack(tsk);
> > +
> > + return ret;
> > +}
> > +#endif /* CONFIG_HAVE_RELIABLE_STACKTRACE */
> > +
> > /* Userspace stacktrace - based on kernel/trace/trace_sysprof.c */
> >
> > struct stack_frame_user {
>
> Otherwise, all the logic looks fine to me. Great work!

Thanks!

--
Josh