[PATCH v2] x86/asm/entry: fix stack return address retrieval in thunk

From: Josh Poimboeuf
Date: Tue May 17 2016 - 14:06:15 EST


On Tue, May 17, 2016 at 09:31:12AM -0700, Linus Torvalds wrote:
> On Tue, May 17, 2016 at 7:43 AM, Josh Poimboeuf <jpoimboe@xxxxxxxxxx> wrote:
> >
> > index 98df1fa..dae7ca0 100644
> > --- a/arch/x86/entry/thunk_64.S
> > +++ b/arch/x86/entry/thunk_64.S
> > @@ -15,9 +15,10 @@
> > .globl \name
> > .type \name, @function
> > \name:
> > + /* push 1 register if frame pointers are enabled */
> > FRAME_BEGIN
> >
> > - /* this one pushes 9 elems, the next one would be %rIP */
> > + /* push 9 registers */
>
> I don't hate this patch, but quite frankly, as with the other case,
> I'd just make the frame pointer be unconditional in this case.
>
> If we push nine other registers, the frame pointer setup code is *not*
> going to matter.
>
> The reason to avoid frame pointers in code generation is two-fold:
>
> 1) for small leaf functions, it often ends up dominating
>
> 2) it removes a register that is otherwise usable, which can be
> particularly bad on 32-bit x86 due to the much more limited number of
> registers (and was apparently really noticeable on the older on-order
> atom cores)
>
> and in this case neither of them is really an issue.
>
> So I would suggest that any case that actually depends on a frame
> access just make the frame pointer not just unconditional, but
> _explicit_.
>
> So not just avoiding the macro because it's conditional, but write out
> the sequence to actually set up the frame, and then use
>
> - movq 9*8(%rsp), %rdi
> + movq 8(%rbp), %rdi # return address
>
> to entirely avoid all kind of "how many registers have we pushed" math.
>
> Considering that we got this wrong in two places, it's clearly too
> subtle for our little brains as-is.

Makes sense, thanks. Here's v2:

---

From: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
Subject: [PATCH v2] x86/asm/entry: fix stack return address retrieval in thunk

With CONFIG_FRAME_POINTER enabled, a thunk can pass a bad return address
value to the called function. '9*8(%rsp)' actually gets the frame
pointer, not the return address.

The only users of the 'put_ret_addr_in_rdi' option are two functions
which trace the enabling and disabling of interrupts, so this bug can
result in bad debug or tracing information with CONFIG_IRQSOFF_TRACER or
CONFIG_PROVE_LOCKING.

Fixes: 058fb73274f9 ("x86/asm/entry: Create stack frames in thunk functions")
Reported-by: Matt Fleming <matt@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Josh Poimboeuf <jpoimboe@xxxxxxxxxx>
---
arch/x86/entry/thunk_64.S | 11 +++++------
1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/arch/x86/entry/thunk_64.S b/arch/x86/entry/thunk_64.S
index 98df1fa..027aec4 100644
--- a/arch/x86/entry/thunk_64.S
+++ b/arch/x86/entry/thunk_64.S
@@ -8,16 +8,15 @@
#include <linux/linkage.h>
#include "calling.h"
#include <asm/asm.h>
-#include <asm/frame.h>

/* rdi: arg1 ... normal C conventions. rax is saved/restored. */
.macro THUNK name, func, put_ret_addr_in_rdi=0
.globl \name
.type \name, @function
\name:
- FRAME_BEGIN
+ pushq %rbp
+ movq %rsp, %rbp

- /* this one pushes 9 elems, the next one would be %rIP */
pushq %rdi
pushq %rsi
pushq %rdx
@@ -29,8 +28,8 @@
pushq %r11

.if \put_ret_addr_in_rdi
- /* 9*8(%rsp) is return addr on stack */
- movq 9*8(%rsp), %rdi
+ /* 8(%rbp) is return addr on stack */
+ movq 8(%rbp), %rdi
.endif

call \func
@@ -65,7 +64,7 @@ restore:
popq %rdx
popq %rsi
popq %rdi
- FRAME_END
+ popq %rbp
ret
_ASM_NOKPROBE(restore)
#endif
--
2.4.11