Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm"

From: Josh Poimboeuf
Date: Thu Jul 13 2017 - 14:00:28 EST


On Wed, Jul 12, 2017 at 04:22:13PM -0700, Matthias Kaehlcke wrote:
> El Wed, Jul 12, 2017 at 05:36:30PM -0500 Josh Poimboeuf ha dit:
>
> > On Wed, Jul 12, 2017 at 05:35:47PM -0500, Josh Poimboeuf wrote:
> > > On Wed, Jul 12, 2017 at 03:20:40PM -0700, Matthias Kaehlcke wrote:
> > > > > This is admittedly an awkward way of achieving this goal, but it's the
> > > > > only way I know how to do it with GCC.
> > > > >
> > > > > What extra instruction does clang add?
> > > >
> > > > I was looking at the get_user() call in drm_mode_setcrtc(). The code
> > > > generated by clang without the patch is:
> > > >
> > > > if (get_user(out_id, &set_connectors_ptr[i])) {
> > > > ffffffff81386955: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax
> > > > ffffffff8138695c: 00
> > > > ffffffff8138695d: 49 03 06 add (%r14),%rax
> > > > ffffffff81386960: e8 2b a5 f0 ff callq ffffffff81290e90 <__get_user_4>
> > > >
> > > > And with the patch:
> > > >
> > > > if (get_user(out_id, &set_connectors_ptr[i])) {
> > > > ffffffff81386a56: 4a 8d 04 bd 00 00 00 lea 0x0(,%r15,4),%rax
> > > > ffffffff81386a5d: 00
> > > > ffffffff81386a5e: 49 03 06 add (%r14),%rax
> > > > ffffffff81386a61: 48 8b 64 24 28 mov 0x28(%rsp),%rsp
> > > > ffffffff81386a66: e8 15 a5 f0 ff callq
> > > > ffffffff81290f80 <__get_user_4>
> > >
> > > Hm, that seems odd. Can you sure the disassembly for the whole
> > > function?
> >
> > Er, share :-)
>
> Sure, please find below the disassemblies with and without the
> patch. The exact extra instruction differs from the one above, the
> disassembly above is from a debug session with some 'random' kernel
> version (bisect), the ones below from a v4.12ish kernel. At the bottom
> you also find a log of a double faults observed with the patch.
>
> If you are interested in building the kernel with clang yourself I can
> provide instructions, it is fairly painless nowadays as long as you
> have a recent version of clang (a somewhat older version should also
> do for this issue with some extra kernel patches).

Here's the reason for the double fault. First it puts zero on the stack
at offset -0x58:

> ffffffff81367616: 31 c0 xor %eax,%eax
> ffffffff81367618: 48 89 45 c8 mov %rax,-0x38(%rbp)
> ffffffff8136761c: 45 31 ff xor %r15d,%r15d
> ffffffff8136761f: 48 89 45 a8 mov %rax,-0x58(%rbp)

Then, later, it copies that zeroed word from the stack to RSP:

> ffffffff81367874: 48 8b 65 a8 mov -0x58(%rbp),%rsp

Then it double faults because the call instruction tries to write RIP on
the stack, but RSP is zero:

> ffffffff81367878: e8 73 26 f1 ff callq ffffffff81279ef0 <__get_user_4>

Then clang tries to put RSP's value on the stack, at the same stack slot
where the original zero was stored (though it never reaches this point):

> ffffffff8136787d: 49 89 d4 mov %rdx,%r12
> ffffffff81367880: 48 89 65 a8 mov %rsp,-0x58(%rbp)

The panic is consistent with the above. RIP points to the call
instruction, RSP is zero:

> [ 3.798722] PANIC: double fault, error_code: 0x0
> [ 3.807387] CPU: 1 PID: 605 Comm: frecon Not tainted 4.12.0-00023-g711d82c128ff #107
> [ 3.816040] Hardware name: GOOGLE Squawks, BIOS Google_Squawks.5216.152.76 03/04/2016
> [ 3.824792] task: ffff880075b92f00 task.stack: ffffc90000d6c000
> [ 3.829599] EXT4-fs (mmcblk0p1): re-mounted. Opts: commit=600,data=ordered
> [ 3.839092] RIP: 0010:drm_mode_setcrtc+0x328/0x51f
> [ 3.844443] RSP: 0018:0000000000000000 EFLAGS: 00010206
> [ 3.850280] RAX: 0000559e707c4d60 RBX: 0000000000000000 RCX: 0000000000000008
> [ 3.858253] RDX: 0000000000000001 RSI: ffffc90000d6fcc8 RDI: ffffffff81367805
> [ 3.866225] RBP: ffffc90000d6fd90 R08: 00000000014000c0 R09: 0000000000000308
> [ 3.874199] R10: 0000000000000300 R11: 0000000000000556 R12: 0000000000000000
> [ 3.882163] R13: ffff880077a25000 R14: ffffc90000d6fdd0 R15: 0000000000000000
> [ 3.890136] FS: 00007fb2dbd62740(0000) GS:ffff88007ad00000(0000) knlGS:0000000000000000
> [ 3.899177] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 3.905595] CR2: fffffffffffffff8 CR3: 000000007596b000 CR4: 00000000001006e0
> [ 3.913568] Call Trace:
> [ 3.916296] Code: b8 48 8d b5 38 ff ff ff 41 8b 56 08 39 d3 0f 83 85 00 00 00 4c 63 fb 4a 83 24 f8 00 4a 8d 04 bd 00 00 00 00 49 03 06 48 8b 65 a8 <e8> 73 26 f1 ff 49 89 d4 48 89 65 a8 85 c0 0f 85 a0 00 00 00 ba
> [ 3.937440] Kernel panic - not syncing: Machine halted.
> [ 3.943268] CPU: 1 PID: 605 Comm: frecon Not tainted 4.12.0-00023-g711d82c128ff #107
> [ 3.951921] Hardware name: GOOGLE Squawks, BIOS Google_Squawks.5216.152.76 03/04/2016
> [ 3.960671] Call Trace:
> [ 3.963398] <#DF>
> [ 3.965637] __dump_stack+0x19/0x1b
> [ 3.969531] dump_stack+0x42/0x60

clang is obviously getting confused by the RSP output constraint. I
think it tries to take the constraint literally, since it takes RSP as
an output from the inline asm and stores it on the stack. However, that
behavior doesn't really make sense for a "register" variable. It also
doesn't explain why it's zeroing the register out first.

What happens if you try the below patch instead of the revert? Any
chance the offending instruction goes away?

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 11433f9..beac907 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -171,7 +171,7 @@ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
might_fault(); \
asm volatile("call __get_user_%P4" \
: "=a" (__ret_gu), "=r" (__val_gu), "+r" (__sp) \
- : "0" (ptr), "i" (sizeof(*(ptr)))); \
+ : "0" (ptr), "i" (sizeof(*(ptr))), "r" (__sp)); \
(x) = (__force __typeof__(*(ptr))) __val_gu; \
__builtin_expect(__ret_gu, 0); \
})