RE: [PATCH v4 3/4] x86/uaccess: Use pointer masking to limit uaccess speculation

From: David Laight
Date: Wed May 05 2021 - 10:50:00 EST


From: Mark Rutland
> Sent: 05 May 2021 15:26
...
> > +/*
> > + * Sanitize a user pointer such that it becomes NULL if it's not a valid user
> > + * pointer. This prevents speculatively dereferencing a user-controlled
> > + * pointer to kernel space if access_ok() speculatively returns true. This
> > + * should be done *after* access_ok(), to avoid affecting error handling
> > + * behavior.
> > + */
> > +#define mask_user_ptr(ptr) \
> > +({ \
> > + unsigned long _ptr = (__force unsigned long)ptr; \
> > + unsigned long mask; \
> > + \
> > + asm volatile("cmp %[max], %[_ptr]\n\t" \
> > + "sbb %[mask], %[mask]\n\t" \
> > + : [mask] "=r" (mask) \
> > + : [_ptr] "r" (_ptr), \
> > + [max] "r" (TASK_SIZE_MAX) \
> > + : "cc"); \
> > + \
> > + mask &= _ptr; \
> > + ((typeof(ptr)) mask); \
> > +})
>
> On arm64 we needed to have a sequence here because the addr_limit used
> to be variable, but now that we've removed set_fs() and split the
> user/kernel access routines we could simplify that to an AND with an
> immediate mask to force all pointers into the user half of the address
> space. IIUC x86_64 could do the same, and I think that was roughly what
> David was suggesting.

Something like that :-)

For 64bit you can either unconditionally mask the user address
(to clear a few high bits) or mask with a calculated value
if the address is invalid.
The former is almost certainly better.

The other thing is that a valid length has to be less than
the TASK_SIZE_MAX.
Provided there are 2 zero bits at the top of every user address
you can check 'addr | size < limit' and know that 'addr + size'
won't wrap into kernel space.

32bit is more difficult.
User addresses (probably) go up to 0xc0000000 and the kernel
starts (almost) immediately.
If you never map a 4k page on one side of the boundary then
you only need to check the base address provided the user buffer
is less than 4k, or the accesses are guaranteed to be sequential.
While the full window test isn't that complicated ignoring the
length will remove some code - especially for hot paths that
use __get_user() to access a fixed size structure

> That does mean that you could still speculatively access user memory
> erroneously other than to NULL, but that's also true for speculated
> pointers below TASK_SIZE_MAX when using the more complex sequence.

True, but there are almost certainly easier ways to speculatively
access user addresses than passing a kernel alias of the address
into a system call!

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)