Re: [RFC][PATCH 01/22] x86 user stack frame reads: switch to explicit __get_user()

From: Linus Torvalds
Date: Sun Mar 29 2020 - 14:16:42 EST


On Sun, Mar 29, 2020 at 11:03 AM David Laight <David.Laight@xxxxxxxxxx> wrote:
>
> > That's how get_user() already works.
> >
> > It is a polymorphic function (done using macros, sizeof() and ugly
> > compiler tricks) that generates a call, yes. But it's not a normal C
> > call. On x86-64, it returns the error code in %rax, and the value in
> > %rdx
>
> I must be mis-remembering the object code from last time
> I looked at it.

On an object code level, the end result actually almost looks like a
normal call, until you start looking at the exact register passing
details.

On a source level, it's anything but.

This is "get_user()" on x86:

#define get_user(x, ptr) \
({ \
int __ret_gu; \
register __inttype(*(ptr)) __val_gu asm("%"_ASM_DX); \
__chk_user_ptr(ptr); \
might_fault(); \
asm volatile("call __get_user_%P4" \
: "=a" (__ret_gu), "=r" (__val_gu), \
ASM_CALL_CONSTRAINT \
: "0" (ptr), "i" (sizeof(*(ptr)))); \
(x) = (__force __typeof__(*(ptr))) __val_gu; \
__builtin_expect(__ret_gu, 0); \
})

and all it actually *generates* is that single "call" instruction, so
the only code you see from the above mess in the object file (assuming
you don't have debugging enabled that expands "might_fault()" into a
lot of checking code) is something like this:

call __get_user_4

with the input address being in %rax (which is also the returning
error code), and the value of 'x' being in %rdx.

The above macro looks a bit messy, but that's actually hiding some of
the real complexity (that "__inttype()" thing is a mess of compiler
tricks in itself, as is the magical ASM_CALL_CONSTRAINT thing, along
with the magic that goes into a 64-bit return value on x86-32 which is
implicit in the magical register asm() definition).

The uaccess code is a lot of complex macros and inline functions to
make it all work well.

Al's uaccess cleanups (which should be in the next merge window) help
a _lot_. Not with the get_user() above (that is already about as
simple as it can get), but with a lot of the other crazy special cases
that we had grown over the years.

My current "unsafe_get_user() using clang and 'asm goto' with outpus"
patch is on top of Al's patches exactly because it was cleaner to do
with all of his cleanups.

But that magical asm-goto-with-outputs patch probably won't land
upstream for a couple of years.

I had the unsafe_put_user() patches to use 'asm goto' in a local
branch for almost three years before I made them official. See commit
a959dc88f9c8 ("Use __put_user_goto in __put_user_size() and
unsafe_put_user()") and notice how it has an author date of May, 2016,
but was only committed January 2019.

I basically waited until "asm goto" was something we could rely on
(and used for other reasons).

I suspect something very similar will happen with unsafe_get_user(). I
have a patch if people want to play with it, but right now it's a
"this relies on an unreleased version of clang to even work".

Linus