Re: [PATCH 3/3] rseq: optimise for 64bit arches
From: Mathieu Desnoyers
Date: Tue Apr 13 2021 - 10:21:54 EST
----- On Apr 13, 2021, at 6:36 AM, David Laight David.Laight@xxxxxxxxxx wrote:
> From: Peter Zijlstra
>> Sent: 13 April 2021 10:10
>>
>> On Tue, Apr 13, 2021 at 12:36:57AM -0700, Eric Dumazet wrote:
>> > From: Eric Dumazet <edumazet@xxxxxxxxxx>
>> >
>> > Commit ec9c82e03a74 ("rseq: uapi: Declare rseq_cs field as union,
>> > update includes") added regressions for our servers.
>> >
>> > Using copy_from_user() and clear_user() for 64bit values
>> > on 64bit arches is suboptimal.
>> >
>> > We might revisit this patch once all 32bit arches support
>> > get_user() and/or put_user() for 8 bytes values.
>>
>> Argh, what a mess :/ afaict only nios32 lacks put_user_8, but get_user_8
>> is missing in a fair number of archs.
>>
>> That said; 32bit archs never have to actually set the top bits in that
>> word, so they _could_ only set the low 32 bits. That works provided
>> userspace itself keeps the high bits clear.
>
> Does that work for 32bit BE ?
Yes, because uapi/linux/rseq.h defines the ptr32 as:
#if (defined(__BYTE_ORDER) && (__BYTE_ORDER == __BIG_ENDIAN)) || defined(__BIG_ENDIAN)
__u32 padding; /* Initialized to zero. */
__u32 ptr32;
#else /* LITTLE */
__u32 ptr32;
__u32 padding; /* Initialized to zero. */
#endif /* ENDIAN */
which takes care of BE vs LE.
>
> David
>
>> So I suppose that if we're going to #ifdef this, we might as well do the
>> whole thing.
>>
>> Mathieu; did I forget a reason why this cannot work?
The only difference it brings on 32-bit is that the truncation of high bits
will be done before the following validation:
if (!ptr) {
memset(rseq_cs, 0, sizeof(*rseq_cs));
return 0;
}
if (ptr >= TASK_SIZE)
return -EINVAL;
The question is whether we really want to issue a segmentation fault if 32-bit
user-space has set non-zero high bits, or if silently ignoring those high
bits is acceptable.
Nits below:
>>
>> diff --git a/kernel/rseq.c b/kernel/rseq.c
>> index a4f86a9d6937..94006190b8eb 100644
>> --- a/kernel/rseq.c
>> +++ b/kernel/rseq.c
>> @@ -115,20 +115,25 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
>> static int rseq_get_rseq_cs(struct task_struct *t, struct rseq_cs *rseq_cs)
>> {
>> struct rseq_cs __user *urseq_cs;
>> - u64 ptr;
>> + unsigned long ptr;
I am always reluctant to use long/unsigned long type as type for the get/put_user
(x) argument, because it hides the cast deep within architecture-specific macros.
I understand that in this specific case it happens that on 64-bit archs we end up
casting a u64 to unsigned long (same size), and on 32-bit archs we end up casting a
u32 to unsigned long (also same size), so there is no practical concern about type
promotion and sign-extension, but I think it would be better to have something
explicit, e.g.:
#ifdef CONFIG_64BIT
static int rseq_get_cs_ptr(struct rseq_cs __user **uptrp, struct rseq_cs *rseq_cs)
{
u64 ptr;
if (get_user(ptr, &rseq_cs->ptr64))
return -EFAULT;
*ptrp = (struct rseq_cs __user *)ptr;
return 0;
}
#else
static int rseq_get_cs_ptr(struct rseq_cs __user **uptrp, struct rseq_cs *rseq_cs)
{
u32 ptr;
if (get_user(ptr, &rseq_cs->ptr.ptr32))
return -EFAULT;
*ptrp = (struct rseq_cs __user *)ptr;
return 0;
}
#endif
And use those helpers to get the ptr value.
>> u32 __user *usig;
>> u32 sig;
>> int ret;
>>
>> - if (copy_from_user(&ptr, &t->rseq->rseq_cs.ptr64, sizeof(ptr)))
>> +#ifdef CONFIG_64BIT
>> + if (get_user(ptr, &t->rseq->rseq_cs.ptr64))
>> return -EFAULT;
>> +#else
>> + if (get_user(ptr, &t->rseq->rseq_cs.ptr32))
Note that this is also not right. It should be &t->rseq->rseq_cs.ptr.ptr32.
Thanks,
Mathieu
>> + return -EFAULT;
>> +#endif
>> if (!ptr) {
>> memset(rseq_cs, 0, sizeof(*rseq_cs));
>> return 0;
>> }
>> if (ptr >= TASK_SIZE)
>> return -EINVAL;
>> - urseq_cs = (struct rseq_cs __user *)(unsigned long)ptr;
>> + urseq_cs = (struct rseq_cs __user *)ptr;
>> if (copy_from_user(rseq_cs, urseq_cs, sizeof(*rseq_cs)))
>> return -EFAULT;
>>
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT,
> UK
> Registration No: 1397386 (Wales)
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com