Re: [PATCH] Revert "x86/uaccess: Add stack frame output operand in get_user() inline asm"
From: Josh Poimboeuf
Date: Thu Jul 20 2017 - 16:57:08 EST
On Thu, Jul 20, 2017 at 06:30:24PM +0300, Andrey Ryabinin wrote:
> FWIW below is my understanding of what's going on.
>
> It seems clang treats a local named register variable almost the same
> as an ordinary local variable.
> The only difference is that before reading the register variable clang
> puts the variable's value into the specified register.
>
> So clang just assigns a stack slot for the variable __sp where it's
> going to keep the variable's value.
> But since __sp is uninitialized (we haven't assigned anything to it), the
> value of __sp is some garbage from the stack.
> The inline asm specifies __sp as an input, so clang assumes that it has to
> load __sp into 'rsp' because the inline asm is going to use
> it. And it just loads garbage from the stack into 'rsp'.
>
> In fact, such behavior (I mean storing the value on the stack and loading
> it into the reg before use) is very useful.
> Clang's behavior allows the value assigned to a call-clobbered
> register to be kept across function calls.
>
> Unlike clang, gcc assigns the value to the register right away and doesn't
> store the value anywhere else. So if the reg is a
> call-clobbered register you have to be absolutely sure that there is
> no subsequent function call that might clobber the register.
>
> E.g. see some real examples
> https://patchwork.kernel.org/patch/4111971/ or 98d4ded60bda("msm: scm:
> Fix improper register assignment").
> These bugs shouldn't happen with clang.
>
> But a global named register variable works slightly differently in clang.
> For the global, the value is just the value of the register itself,
> whatever it is. A read/write of a global named register variable is just
> like a direct read/write of the register.
Thanks, that clears up a lot of the confusion for me.
Still, unfortunately, I don't think that's going to work for GCC.
Changing the '__sp' register variable to global in the header file
causes it to make a *bunch* of changes across the kernel, even in
functions which don't do inline asm. It seems to be disabling some
optimizations across the board.
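
For reference, the two declaration forms being compared look roughly like
the sketch below. The names sp_global and stack_pointer_now() are made up
for illustration, and the register name is picked per architecture only so
that the snippet builds outside the kernel. The local form is left as a
comment, since (per the explanation above) clang may load an uninitialized
stack slot into the register when it is used as an asm operand.

```c
#if defined(__x86_64__)
# define SP_REG "rsp"
#elif defined(__aarch64__)
# define SP_REG "sp"
#endif

#ifdef SP_REG
/*
 * Global form (hypothetical name): declared at file scope, so every
 * function that sees this declaration is compiled with it in effect.
 */
register unsigned long sp_global __asm__(SP_REG);
#endif

/*
 * Local form, the one under discussion, shown as a comment only:
 *
 *	register void *__sp asm("rsp");
 *	asm volatile("call foo" : "+r" (__sp) : ... );
 */

/* Hypothetical helper: returns the current stack pointer value. */
static unsigned long stack_pointer_now(void)
{
#ifdef SP_REG
	return sp_global;	/* a read of the register itself */
#else
	/* Fallback for other architectures: approximate with the frame address. */
	return (unsigned long)__builtin_frame_address(0);
#endif
}
```
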
I do have another idea, which is to replace all uses of
	asm(" ... call foo ... " : outputs : inputs : clobbers);
with a new ASM_CALL macro:
	ASM_CALL(" ... call foo ... ", outputs, inputs, clobbers);
Then the compiler differences can be abstracted out, with GCC adding
"sp" as an output constraint and clang doing nothing (for now).
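
A minimal sketch of what such a macro could look like, not a final
implementation. ARGS() and add_via_asm() are hypothetical names introduced
here; ARGS() just groups a comma-separated operand list into one macro
argument. The sketch assumes x86-64, at least one output operand, and named
asm operands (the extra "sp" operand on GCC would shift positional %0/%1
numbering). The demo uses a plain "add" rather than a real "call" so that it
runs safely in user space, where a call from inline asm would also need care
with the red zone and stack alignment.

```c
/* Hypothetical helper: wraps an operand list so its commas survive as one argument. */
#define ARGS(...) __VA_ARGS__

#if defined(__GNUC__) && !defined(__clang__) && defined(__x86_64__)
/*
 * GCC: add the stack pointer as an in/out operand. Sketch only: it
 * requires a non-empty "outputs" list because of the comma after __sp.
 */
#define ASM_CALL(str, outputs, inputs, clobbers)			\
do {									\
	register void *__sp __asm__("rsp");				\
	__asm__ volatile(str : "+r" (__sp), outputs : inputs : clobbers); \
} while (0)
#else
/* clang (and anything else): no extra operand, for now. */
#define ASM_CALL(str, outputs, inputs, clobbers)			\
	__asm__ volatile(str : outputs : inputs : clobbers)
#endif

/* Hypothetical demo: a += b through the macro, using named operands. */
static long add_via_asm(long a, long b)
{
#if defined(__x86_64__)
	ASM_CALL("add %[b], %[a]",
		 ARGS([a] "+r" (a)),
		 ARGS([b] "r" (b)),
		 ARGS("cc"));
#else
	a += b;		/* non-x86 fallback so the sketch still builds */
#endif
	return a;
}
```

A caller would then only ever write the ASM_CALL(...) form, and the
compiler-specific operand stays in one place.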
--
Josh