Re: [PATCH 1/2] x86/stacktrace: do not fail when regs on stack for ORC

From: Josh Poimboeuf
Date: Thu Nov 30 2017 - 14:57:16 EST


On Thu, Nov 30, 2017 at 09:03:24AM +0100, Jiri Slaby wrote:
> save_stack_trace_reliable now returns "non reliable" when there are
> kernel pt_regs on stack. This means an interrupt or exception happened.
> Somewhere down the route. It is a problem for frame pointer unwinder,
> because the frame might now have been set up yet when the irq happened,
> so it might fail to unwind from the interrupted function.
>
> With ORC, this is not a problem, as ORC has out-of-band data. We can
> find ORC data even for the IP in interrupted function and always unwind
> one level up.
>
> So introduce `unwind_regs_reliable' which decides if this is an issue
> for the currently selected unwinder at all and change the code
> accordingly.

Thanks. I'm thinking there are a few ways we can simplify things. (Most of
these are actually issues with the existing code.)

- Currently we check to make sure that there's no frame *after* the user
  space regs. I think there's no way that could ever happen and the
  check is overkill.

- We should probably remove the STACKTRACE_DUMP_ONCE() warnings. There
  are some known places where a stack trace will fail, particularly with
  generated code. I wish we had a DEBUG_WARN_ON() macro which used
  pr_debug(), but oh well. At least the livepatch code has some helpful
  pr_warn()s, those are probably good enough.

- The unwind->error checks are superfluous. The only errors we need to
  check for are (a) whether the FP unwinder encountered a kernel irq and
  (b) whether we reached the final user regs frame. So I think
  unwind->error can be removed altogether.

So with those changes in mind, how about something like this (plus
comments)?

	for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
	     unwind_next_frame(&state)) {

		regs = unwind_get_entry_regs(&state);
		if (regs) {
			if (user_mode(regs))
				goto success;

			if (IS_ENABLED(CONFIG_FRAME_POINTER))
				return -EINVAL;
		}

		addr = unwind_get_return_address(&state);
		if (!addr)
			return -EINVAL;

		if (save_stack_address(trace, addr, false))
			return -EINVAL;
	}

	return -EINVAL;

success:
	if (trace->nr_entries < trace->max_entries)
		trace->entries[trace->nr_entries++] = ULONG_MAX;

	return 0;
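
As a rough illustration of the control flow above, here is a userspace
model of the same decision logic. It is only a sketch, not kernel code:
struct model_frame, struct model_trace and model_save_reliable() are
hypothetical names made up for this example, the unwinder is replaced by
a plain array of frames, and save_stack_address() is assumed to fail
only when the trace buffer is full.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for what the unwinder reports at one frame. */
struct model_frame {
	bool has_regs;      /* models unwind_get_entry_regs() != NULL */
	bool user_regs;     /* models user_mode(regs) */
	unsigned long addr; /* return address; 0 models a failed lookup */
};

/* Hypothetical, cut-down version of the trace buffer. */
struct model_trace {
	unsigned long *entries;
	unsigned int nr_entries;
	unsigned int max_entries;
};

/*
 * Returns 0 only if the walk ends at user-mode regs. With
 * frame_pointer set (modelling CONFIG_FRAME_POINTER), kernel regs
 * on the stack make the trace unreliable; without it (modelling
 * ORC), they are simply walked through.
 */
static int model_save_reliable(const struct model_frame *frames, size_t n,
			       bool frame_pointer, struct model_trace *trace)
{
	assert(frames && trace);

	for (size_t i = 0; i < n; i++) {
		const struct model_frame *f = &frames[i];

		if (f->has_regs) {
			if (f->user_regs) {
				/* Reached the final user regs frame: success. */
				if (trace->nr_entries < trace->max_entries)
					trace->entries[trace->nr_entries++] = ULONG_MAX;
				return 0;
			}

			/* Kernel regs: an irq or exception hit kernel code. */
			if (frame_pointer)
				return -EINVAL;
		}

		if (!f->addr)
			return -EINVAL;

		/* Assumed save_stack_address() behaviour: fail when full. */
		if (trace->nr_entries >= trace->max_entries)
			return -EINVAL;
		trace->entries[trace->nr_entries++] = f->addr;
	}

	/* Ran out of frames without ever seeing user regs. */
	return -EINVAL;
}
```

Feeding it a stack that contains kernel regs followed by user regs
should give 0 with frame_pointer == false and -EINVAL with
frame_pointer == true, which is the behavioural difference between the
two unwinders that the patch is after.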

After these changes I believe we can enable
CONFIG_HAVE_RELIABLE_STACKTRACE for ORC.

Also, when you post the next version, please cc the live patching
mailing list, since this is directly relevant to livepatch.

Thanks!

--
Josh