Re: [PATCH] arch/x86: fix out-of-bounds in get_wchan()

From: Dmitry Vyukov
Date: Mon Sep 28 2015 - 05:49:44 EST


On Mon, Sep 28, 2015 at 11:37 AM, Borislav Petkov <bp@xxxxxxxxx> wrote:
> On Mon, Sep 28, 2015 at 11:00:39AM +0200, Dmitry Vyukov wrote:
>> get_wchan() checks that fp is within stack bounds,
>> but then dereferences fp+8. This can crash the kernel
>> or leak sensitive information. Also, the function
>> operates on a potentially running stack, but does
>> not use READ_ONCE. As a result, it can check that
>> one value is within stack bounds, but then dereference
>> another value.
>>
>> Fix the bounds check and use READ_ONCE for all
>> volatile data.
>>
>> The bug was discovered with KASAN.
>>
>> Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
>> ---
>> FTR, here is the KASAN report:
>>
>> [ 124.575597] ERROR: AddressSanitizer: heap-buffer-overflow on address ffff88002e280000
>> [ 124.578633] Accessed by thread T10915:
>> [ 124.581050] #2 ffffffff810dd423 in __tsan_read8 ??:0
>> [ 124.581893] #3 ffffffff8107c093 in get_wchan ./arch/x86/kernel/process_64.c:444
>> [ 124.582763] #4 ffffffff81342108 in do_task_stat array.c:0
>> [ 124.583634] #5 ffffffff81342dcc in proc_tgid_stat ??:0
>> [ 124.584548] #6 ffffffff8133c984 in proc_single_show base.c:0
>> [ 124.585461] #7 ffffffff812d18cc in seq_read ./fs/seq_file.c:222
>> [ 124.586313] #8 ffffffff8129e503 in vfs_read ??:0
>> [ 124.587137] #9 ffffffff8129f800 in SyS_read ??:0
>> [ 124.587827] #10 ffffffff81929bf5 in sysenter_dispatch ./arch/x86/ia32/ia32entry.S:164
>> [ 124.588738]
>> [ 124.593434] Shadow bytes around the buggy address:
>> [ 124.594270] ffff88002e27fd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> [ 124.595339] ffff88002e27fe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> [ 124.596453] ffff88002e27fe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> [ 124.597466] ffff88002e27ff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> [ 124.598501] ffff88002e27ff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> [ 124.599629] =>ffff88002e280000:[fa]fa fa fa fa fa fa fa fa fa 00 00 00 00 00 00
>> [ 124.600873] ffff88002e280080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> [ 124.601892] ffff88002e280100: 00 fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
>> [ 124.603037] ffff88002e280180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
>> [ 124.604047] ffff88002e280200: fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd fd
>> [ 124.605054] ffff88002e280280: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa fa
>> [ 124.605993] Shadow byte legend (one shadow byte represents 8 application bytes):
>> [ 124.606958] Addressable: 00
>> [ 124.607483] Partially addressable: 01 02 03 04 05 06 07
>> [ 124.608219] Heap redzone: fa
>> [ 124.608724] Heap kmalloc redzone: fb
>> [ 124.609249] Freed heap region: fd
>> [ 124.609753] Shadow gap:fe
>> [ 124.610292] =========================================================================
>> ---
>> arch/x86/kernel/process_64.c | 12 +++++++-----
>> 1 file changed, 7 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
>> index 71d7849..a1fce34 100644
>> --- a/arch/x86/kernel/process_64.c
>> +++ b/arch/x86/kernel/process_64.c
>> @@ -506,17 +506,19 @@ unsigned long get_wchan(struct task_struct *p)
>>  	if (!p || p == current || p->state == TASK_RUNNING)
>>  		return 0;
>>  	stack = (unsigned long)task_stack_page(p);
>> -	if (p->thread.sp < stack || p->thread.sp >= stack+THREAD_SIZE)
>> +	/* The task can be already running at this point, so tread carefully. */
>> +	fp = READ_ONCE(p->thread.sp);
>> +	if (fp < stack || fp >= stack+THREAD_SIZE)
>>  		return 0;
>> -	fp = *(u64 *)(p->thread.sp);
>> +	fp = READ_ONCE(*(u64 *)fp);
>
> Why isn't this:
>
> fp = READ_ONCE(*(u64 *)p->thread.sp);
>
> like the original code did?


The original code did:

	if (p->thread.sp < stack || p->thread.sp >= stack+THREAD_SIZE)
		return 0;
	fp = *(u64 *)(p->thread.sp);

p->thread.sp can change concurrently, and each mention of it above is a
separate load. So we could check that one value of p->thread.sp is within
the stack bounds, but then dereference a different value (one that is
already out of bounds).
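
To illustrate the pattern outside the kernel, here is a minimal user-space
sketch. It is not the kernel code: struct fake_thread, read_saved_fp() and
the READ_ONCE_U64() macro are hypothetical stand-ins made up for this
example. The idea is to load the shared value exactly once into a local,
validate that local (leaving room for the 8-byte read), and dereference the
same local.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* User-space stand-in for the kernel's READ_ONCE(): force a single load. */
#define READ_ONCE_U64(x) (*(const volatile uint64_t *)&(x))

/* Hypothetical stand-in for a field another CPU may rewrite (p->thread.sp). */
struct fake_thread {
	uint64_t sp;
};

/*
 * Return the 8-byte word that the saved sp points at, or 0 if sp does not
 * leave room for an 8-byte read inside [stack, stack + size).
 */
static uint64_t read_saved_fp(struct fake_thread *t,
			      const unsigned char *stack, size_t size)
{
	uint64_t base = (uint64_t)(uintptr_t)stack;
	uint64_t sp;
	uint64_t fp;

	assert(size >= sizeof(fp));

	/* Snapshot once; every later use refers to this local copy. */
	sp = READ_ONCE_U64(t->sp);

	/* Validate the snapshot, leaving room for the 8-byte read. */
	if (sp < base || sp > base + size - sizeof(fp))
		return 0;

	/* Dereference the same value that was just validated. */
	memcpy(&fp, (const void *)(uintptr_t)sp, sizeof(fp));
	return fp;
}
```

Checking and dereferencing t->sp directly would compile to two independent
loads, which is exactly the window described above.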

> Actually, the original code looks fishy to me too - it did access live
> stack three times. And shouldn't we be accessing it only once?
>
> I.e.,
>
> 	fp_st = READ_ONCE(p->thread.sp);
> 	if (fp_st < stack || fp_st >= stack + THREAD_SIZE)
> 		return 0;
> 	fp = *(u64 *)fp_st;
>
> Hmm?

That's what my patch does.


> Maybe I'm not completely clear on how the whole locking happens here
> because we do
>
> 	if (!p || p == current || p->state == TASK_RUNNING)
> 		return 0;
>
> earlier but apparently we can become TASK_RUNNING after the check...
>
> Also, shouldn't this one have a CVE number assigned or so due to the
> leakage potential?