Re: [PATCH] x86, x86_64: Fix checks for userspace address limit

From: Brian Gerst
Date: Wed May 18 2011 - 00:48:34 EST


On Mon, May 16, 2011 at 7:42 AM, Ingo Molnar <mingo@xxxxxxx> wrote:
>
> * Jiri Olsa <jolsa@xxxxxxxxxx> wrote:
>
>> hi,
>> there seems to be bug in the _copy_to_user and _copy_from_user
>> functions, not allowing access to the last user page.
>>
>> Also I tried to decipher the inline assembly in __range_not_ok,
>> and it seems to work properly, but the macro comment seems to
>> be misleading.
>>
>> wbr,
>> jirka
>>
>> ---
>> As shown in BZ 30352 (https://bugzilla.kernel.org/show_bug.cgi?id=30352)
>> there's an issue with reading last allowed page on x86_64.
>>
>> The _copy_to_user and _copy_from_user functions use following
>> check for address limit:
>>
>> if (buf + size >= limit)
>> Â Â Â fail
>>
>> while it should be:
>>
>> if (buf + size > limit)
>> Â Â Â fail
>>
>> That's because the size represents the number of bytes being
>> read/write from/to buf address AND including the buf address.
>> So the copy function will actually never touch the limit
>> address even if "buf + size == limit".
>>
>> Following program fails to use the last page as buffer
>> due to the wrong limit check.
>>
>> ---
>> #include <sys/mman.h>
>> #include <sys/socket.h>
>> #include <assert.h>
>>
>> #define PAGE_SIZE Â Â Â (4096)
>> #define LAST_PAGE Â Â Â ((void*)(0x7fffffffe000))
>>
>> int main()
>> {
>> Â Â Â Â int fds[2], err;
>> Â Â Â Â void * ptr = mmap(LAST_PAGE, PAGE_SIZE, PROT_READ | PROT_WRITE,
>> Â Â Â Â Â Â Â Â Â Â Â Â Â MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED, -1, 0);
>> Â Â Â Â assert(ptr == LAST_PAGE);
>> Â Â Â Â err = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
>> Â Â Â Â assert(err == 0);
>> Â Â Â Â err = send(fds[0], ptr, PAGE_SIZE, 0);
>> Â Â Â Â perror("send");
>> Â Â Â Â assert(err == PAGE_SIZE);
>> Â Â Â Â err = recv(fds[1], ptr, PAGE_SIZE, MSG_WAITALL);
>> Â Â Â Â perror("recv");
>> Â Â Â Â assert(err == PAGE_SIZE);
>> Â Â Â Â return 0;
>> }
>> ---
>>
>> Other place checking the addr limit is access_ok function,
>> which is working properly. There's just misleading comment
>> for the __range_not_ok macro.
>>
>>
>> Signed-off-by: Jiri Olsa <jolsa@xxxxxxxxxx>
>> ---
>> Âarch/x86/include/asm/uaccess.h | Â Â2 +-
>> Âarch/x86/lib/copy_user_64.S Â Â| Â Â4 ++--
>> Â2 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
>> index abd3e0e..99f0ad7 100644
>> --- a/arch/x86/include/asm/uaccess.h
>> +++ b/arch/x86/include/asm/uaccess.h
>> @@ -42,7 +42,7 @@
>> Â * Returns 0 if the range is valid, nonzero otherwise.
>> Â *
>> Â * This is equivalent to the following test:
>> - * (u33)addr + (u33)size >= (u33)current->addr_limit.seg (u65 for x86_64)
>> + * (u33)addr + (u33)size > (u33)current->addr_limit.seg (u65 for x86_64)
>> Â *
>> Â * This needs 33-bit (65-bit for x86_64) arithmetic. We have a carry...
>> Â */
>> diff --git a/arch/x86/lib/copy_user_64.S b/arch/x86/lib/copy_user_64.S
>> index 99e4826..a73397f 100644
>> --- a/arch/x86/lib/copy_user_64.S
>> +++ b/arch/x86/lib/copy_user_64.S
>> @@ -72,7 +72,7 @@ ENTRY(_copy_to_user)
>> Â Â Â addq %rdx,%rcx
>> Â Â Â jc bad_to_user
>> Â Â Â cmpq TI_addr_limit(%rax),%rcx
>> - Â Â jae bad_to_user
>> + Â Â ja bad_to_user
>> Â Â Â ALTERNATIVE_JUMP X86_FEATURE_REP_GOOD,copy_user_generic_unrolled,copy_user_generic_string
>> Â Â Â CFI_ENDPROC
>> ÂENDPROC(_copy_to_user)
>> @@ -85,7 +85,7 @@ ENTRY(_copy_from_user)
>> Â Â Â addq %rdx,%rcx
>> Â Â Â jc bad_from_user
>> Â Â Â cmpq TI_addr_limit(%rax),%rcx
>> - Â Â jae bad_from_user
>> + Â Â ja bad_from_user
>> Â Â Â ALTERNATIVE_JUMP X86_FEATURE_REP_GOOD,copy_user_generic_unrolled,copy_user_generic_string
>> Â Â Â CFI_ENDPROC
>> ÂENDPROC(_copy_from_user)
>
> Hm, something tickles me about this area that we would reintroduce a security
> hole, that we really wanted to treat the last page of user-space as some sort
> of guard page but i cannot quite remember it why ...
>
> IIRC Linus wrote bits of this so i'm Cc:-ing him just in case he remembers.
>
> Thanks,
>
> Â Â Â ÂIngo

The guard page is apparently due to an erratum on K8 cpus (#121
Sequential Execution Across Non-Canonical Boundary
Causes Processor Hang). However, his test code is using the last
valid page before the guard page. The bug is that the last byte
before the guard page can't be read because of the off-by-one error.

--
Brian Gerst
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/