[PATCH v3 02/10] x86/entry/64: Initialize the top of the IRQ stack before switching stacks

From: Josh Poimboeuf
Date: Tue Jul 11 2017 - 11:34:09 EST

From: Andy Lutomirski <luto@xxxxxxxxxx>

The OOPS unwinder wants the word at the top of the IRQ stack to
point back to the previous stack at all times when the IRQ stack
is in use. There's currently a one-instruction window in ENTER_IRQ_STACK
during which this isn't the case. Fix it by writing the old RSP to the
top of the IRQ stack before jumping.
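
For illustration only, the ordering problem can be modeled in plain C. This
is a rough sketch, not kernel code: the array irq_stack, the tiny
IRQ_STACK_WORDS size, and the helpers top_word_links_back(), enter_old() and
enter_new() are all hypothetical names invented for this example. It assumes
a hypothetical unwinder that interrupts us right after the stack switch and
reads the top word of the IRQ stack.

```c
#include <assert.h>
#include <stdint.h>

#define IRQ_STACK_WORDS 8	/* hypothetical; the real stack is far larger */

/* Hypothetical stand-in for the per-CPU IRQ stack. */
static uint64_t irq_stack[IRQ_STACK_WORDS];

/* What the unwinder wants: the top word links back to the previous RSP. */
static int top_word_links_back(uint64_t old_rsp)
{
	return irq_stack[IRQ_STACK_WORDS - 1] == old_rsp;
}

/*
 * Old ordering: switch stacks first, then push old_rsp.
 * Returns whether the link was valid inside the one-instruction window.
 */
static int enter_old(uint64_t old_rsp, uint64_t **rsp)
{
	*rsp = &irq_stack[IRQ_STACK_WORDS];	/* models the cmovzq */
	int ok_in_window = top_word_links_back(old_rsp);
	*--(*rsp) = old_rsp;			/* models the pushq */
	return ok_in_window;
}

/*
 * New ordering: write the link first, then switch, then push.
 * The link is already valid at the moment we land on the IRQ stack.
 */
static int enter_new(uint64_t old_rsp, uint64_t **rsp)
{
	irq_stack[IRQ_STACK_WORDS - 1] = old_rsp;	/* models the first movq */
	*rsp = &irq_stack[IRQ_STACK_WORDS];		/* models the second movq */
	assert((*rsp)[-1] == old_rsp);			/* models the cmpq/ud2 check */
	int ok_in_window = top_word_links_back(old_rsp);
	*--(*rsp) = old_rsp;				/* the second, redundant write */
	return ok_in_window;
}
```

With a stale value left in the top word, enter_old() reports a broken link
inside the window while enter_new() does not. Both leave the same final
stack contents.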

This currently writes the pointer to the stack twice, which is a bit
ugly. We could get rid of this by replacing irq_stack_ptr with
irq_stack_ptr_minus_eight (better name welcome). OTOH, there may be
all kinds of odd microarchitectural considerations in play that
affect performance by a few cycles here.

Reported-by: Mike Galbraith <efault@xxxxxx>
Reported-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>
Signed-off-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
---
arch/x86/entry/entry_64.S | 24 +++++++++++++++++++++++-
1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 0d4483a..b56f7f2 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -469,6 +469,7 @@ END(irq_entries_start)
movq %rsp, \old_rsp
incl PER_CPU_VAR(irq_count)
+ jnz .Lirq_stack_push_old_rsp_\@

/*
* Right now, if we just incremented irq_count to zero, we've
@@ -478,9 +479,30 @@ END(irq_entries_start)
* it must be *extremely* careful to limit its stack usage. This
* could include kprobes and a hypothetical future IST-less #DB
* handler.
+ *
+ * The OOPS unwinder relies on the word at the top of the IRQ
+ * stack linking back to the previous RSP for the entire time we're
+ * on the IRQ stack. For this to work reliably, we need to write
+ * it before we actually move ourselves to the IRQ stack.
+ */
+ movq \old_rsp, PER_CPU_VAR(irq_stack_union + IRQ_STACK_SIZE - 8)
+ movq PER_CPU_VAR(irq_stack_ptr), %rsp
+ /*
+ * If the first movq above becomes wrong due to IRQ stack layout
+ * changes, the only way we'll notice is if we try to unwind right
+ * here. Assert that we set up the stack right to catch this type
+ * of bug quickly.
*/
+ cmpq -8(%rsp), \old_rsp
+ je .Lirq_stack_okay\@
+ ud2
+ .Lirq_stack_okay\@:

- cmovzq PER_CPU_VAR(irq_stack_ptr), %rsp
+.Lirq_stack_push_old_rsp_\@:
pushq \old_rsp