Re: [PATCH v2 6/8] x86/entry: add unwind hint annotations

From: Josh Poimboeuf
Date: Thu Jun 29 2017 - 17:41:41 EST


On Thu, Jun 29, 2017 at 02:09:54PM -0700, Andy Lutomirski wrote:
> On Thu, Jun 29, 2017 at 12:05 PM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> > On Thu, Jun 29, 2017 at 11:50:18AM -0700, Andy Lutomirski wrote:
> >> On Thu, Jun 29, 2017 at 10:53 AM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >> > There's a bug here that will need a small change to the entry code.
> >> >
> >> > Mike Galbraith reported:
> >> >
> >> > WARNING: can't dereference registers at ffffc900089d7e08 for ip ffffffff81740bbb
> >> >
> >> > After some looking I found that it's caused by the following code
> >> > snippet in the 'interrupt' macro in entry_64.S:
> >> >
> >> > 	/*
> >> > 	 * Save previous stack pointer, optionally switch to interrupt stack.
> >> > 	 * irq_count is used to check if a CPU is already on an interrupt stack
> >> > 	 * or not. While this is essentially redundant with preempt_count it is
> >> > 	 * a little cheaper to use a separate counter in the PDA (short of
> >> > 	 * moving irq_enter into assembly, which would be too much work)
> >> > 	 */
> >> > 	movq	%rsp, %rdi
> >> > 	incl	PER_CPU_VAR(irq_count)
> >> > 	cmovzq	PER_CPU_VAR(irq_stack_ptr), %rsp
> >> > 	UNWIND_HINT_REGS base=rdi
> >> > 	pushq	%rdi
> >> > 	UNWIND_HINT_REGS indirect=1
> >> >
> >> > The problem is that it's changing the stack pointer *before* writing the
> >> > previous stack pointer (push %rdi). So when unwinding from an NMI which
> >> > hit between the rsp write and the rdi push, the unwinder tries to access
> >> > the regs on the previous stack (by reading rdi), but the previous stack
> >> > pointer isn't there yet, so the access is considered out of bounds.
> >>
> >> Ugh, that code. Does this problem go away with this patch applied:
> >>
> >> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/entry_ist&id=2231ec7e0bcc1a2bc94a17081511ab54cc6badd1
> >>
> >> If so, want to update the patch for new kernels (shouldn't conflict
> >> with anything except your unwind hints)?
> >
> > I don't think that patch will fix it, because it still updates rsp
> > *before* writing the old rsp on the new stack. So there's still a
> > window where the "previous stack" pointer is missing.
>
> But it's in a register. Is undwarf not able to grok that?

Sorry, I didn't explain it very well. Undwarf can find the regs pointer
in rdi; it just doesn't trust its value.

See the stack_info.next_sp field, which is set in in_irq_stack():

	/*
	 * The next stack pointer is the first thing pushed by the entry code
	 * after switching to the irq stack.
	 */
	info->next_sp = (unsigned long *)*(end - 1);

It's a safety mechanism. The unwinder needs the last word of the irq
stack page to point to the previous stack. That way it can double check
that the stack pointer it calculates is within the bounds of either the
current stack or the previous stack.

In the above code, the previous stack pointer (or next stack pointer,
depending on your perspective) hasn't been written to the irq stack yet
when rsp is switched. So during that window the unwinder reads an
uninitialized value into info->next_sp, compares it with the regs
pointer, and stops the unwind because it thinks it went off into the
weeds.
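
To make the check concrete, here is a rough user-space sketch of the idea.
It is not the actual unwinder code: struct stack_sketch, range_on_stack(),
load_next_sp() and regs_pointer_trusted() are hypothetical names made up
for illustration, and it assumes unsigned long is pointer-sized, as the
kernel does. It only models "the regs pointer must be on the current stack,
or on the stack that next_sp points into".

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the kernel's struct stack_info. */
struct stack_sketch {
	unsigned long *begin;	/* lowest address of the stack */
	unsigned long *end;	/* one past the highest address */
	unsigned long *next_sp;	/* saved pointer into the previous stack */
};

/* True if [addr, addr + len) lies entirely within [begin, end). */
static bool range_on_stack(const unsigned long *begin,
			   const unsigned long *end,
			   const void *addr, size_t len)
{
	uintptr_t b = (uintptr_t)begin;
	uintptr_t e = (uintptr_t)end;
	uintptr_t a = (uintptr_t)addr;

	return a >= b && a <= e && len <= e - a;
}

/*
 * Mimics the in_irq_stack() snippet: the last word of the irq stack is
 * expected to hold the previous stack pointer.
 */
static void load_next_sp(struct stack_sketch *irq)
{
	irq->next_sp = (unsigned long *)*(irq->end - 1);
}

/*
 * Accept a regs pointer only if it is on the current stack, or if
 * next_sp really points into the previous stack and the regs are
 * there too.  A stale next_sp makes the second case fail.
 */
static bool regs_pointer_trusted(const struct stack_sketch *cur,
				 const struct stack_sketch *prev,
				 const void *regs, size_t len)
{
	if (range_on_stack(cur->begin, cur->end, regs, len))
		return true;

	if (!range_on_stack(prev->begin, prev->end, cur->next_sp,
			    sizeof(unsigned long)))
		return false;

	return range_on_stack(prev->begin, prev->end, regs, len);
}
```

In this model, an NMI landing between the cmovzq and the pushq corresponds
to calling load_next_sp() while the last word of the irq stack still holds
a stale value, so regs_pointer_trusted() returns false even though rdi
holds a perfectly good pointer.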

--
Josh