Re: [PATCH v6 0/4] x86, kasan: add KASAN checks to atomic operations
From: Dmitry Vyukov
Date: Tue Jan 30 2018 - 04:27:40 EST
On Tue, Jan 30, 2018 at 10:23 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> On Mon, Jan 29, 2018 at 6:26 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>> KASAN uses compiler instrumentation to intercept all memory accesses.
>> But it does not see memory accesses done in assembly code.
>> One notable user of assembly code is atomic operations. For example,
>> an atomic reference decrement is frequently the last access to an
>> object, and a good candidate for a racy use-after-free.
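To illustrate the pattern described above, here is a minimal user-space sketch (hypothetical names, not kernel code) of an object whose final atomic decrement is its last access; in the kernel this would be atomic_t and atomic_dec_and_test(), and it is exactly this decrement that KASAN cannot see when it is open-coded in assembly:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical refcounted object. */
struct obj {
	atomic_int refcount;
	int freed; /* test hook: set where free() would happen in real code */
};

static struct obj *obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refcount, 1);
	return o;
}

/* On the final put, the decrement below is the last access to *o.
 * A racing user-after-free on this access is invisible to KASAN
 * when the decrement is implemented in asm. */
static int obj_put(struct obj *o)
{
	if (atomic_fetch_sub(&o->refcount, 1) == 1) {
		o->freed = 1; /* free(o) would go here */
		return 1;
	}
	return 0;
}
```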
>>
>> Atomic operations are defined in arch files, but KASAN instrumentation
>> is required for several archs that support KASAN. Later we will need
>> similar hooks for KMSAN (uninit use detector) and KTSAN (data race
>> detector).
>>
>> This change introduces wrappers around atomic operations that can be
>> used to add KASAN/KMSAN/KTSAN instrumentation across several archs,
>> and adds KASAN checks to them.
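Roughly, each wrapper checks the accessed memory and then defers to the arch primitive. A minimal user-space sketch (kasan_check_write() and arch_atomic_add() are stubbed out here; the real hooks live in the kernel):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* User-space stand-ins for the kernel pieces (hypothetical stubs). */
typedef struct { atomic_int counter; } atomic_t;

static size_t checked_bytes; /* records what the KASAN hook saw */

static void kasan_check_write(const volatile void *p, unsigned int size)
{
	/* real KASAN validates shadow memory for [p, p + size) */
	(void)p;
	checked_bytes = size;
}

static void arch_atomic_add(int i, atomic_t *v)
{
	atomic_fetch_add(&v->counter, i); /* stands in for asm "lock addl" */
}

/* The wrapper: check the access first, then call the arch primitive. */
static void atomic_add(int i, atomic_t *v)
{
	kasan_check_write(v, sizeof(*v));
	arch_atomic_add(i, v);
}
```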
>>
>> This series uses the wrappers only for the x86 arch; arm64 will be
>> switched later. We also plan to instrument bitops in a similar way.
>>
>> Within a day it found its first bug:
>>
>> BUG: KASAN: use-after-free in atomic_dec_and_test
>> arch/x86/include/asm/atomic.h:123 [inline] at addr ffff880079c30158
>> Write of size 4 by task syz-executor6/25698
>> CPU: 2 PID: 25698 Comm: syz-executor6 Not tainted 4.10.0+ #302
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
>> Call Trace:
>> kasan_check_write+0x14/0x20 mm/kasan/kasan.c:344
>> atomic_dec_and_test arch/x86/include/asm/atomic.h:123 [inline]
>> put_task_struct include/linux/sched/task.h:93 [inline]
>> put_ctx+0xcf/0x110 kernel/events/core.c:1131
>> perf_event_release_kernel+0x3ad/0xc90 kernel/events/core.c:4322
>> perf_release+0x37/0x50 kernel/events/core.c:4338
>> __fput+0x332/0x800 fs/file_table.c:209
>> ____fput+0x15/0x20 fs/file_table.c:245
>> task_work_run+0x197/0x260 kernel/task_work.c:116
>> exit_task_work include/linux/task_work.h:21 [inline]
>> do_exit+0xb38/0x29c0 kernel/exit.c:880
>> do_group_exit+0x149/0x420 kernel/exit.c:984
>> get_signal+0x7e0/0x1820 kernel/signal.c:2318
>> do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:808
>> exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:157
>> syscall_return_slowpath arch/x86/entry/common.c:191 [inline]
>> do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:286
>> entry_SYSCALL64_slow_path+0x25/0x25
>> RIP: 0033:0x4458d9
>> RSP: 002b:00007f3f07187cf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
>> RAX: fffffffffffffe00 RBX: 00000000007080c8 RCX: 00000000004458d9
>> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000007080c8
>> RBP: 00000000007080a8 R08: 0000000000000000 R09: 0000000000000000
>> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
>> R13: 0000000000000000 R14: 00007f3f071889c0 R15: 00007f3f07188700
>> Object at ffff880079c30140, in cache task_struct size: 5376
>> Allocated:
>> PID = 25681
>> kmem_cache_alloc_node+0x122/0x6f0 mm/slab.c:3662
>> alloc_task_struct_node kernel/fork.c:153 [inline]
>> dup_task_struct kernel/fork.c:495 [inline]
>> copy_process.part.38+0x19c8/0x4aa0 kernel/fork.c:1560
>> copy_process kernel/fork.c:1531 [inline]
>> _do_fork+0x200/0x1010 kernel/fork.c:1994
>> SYSC_clone kernel/fork.c:2104 [inline]
>> SyS_clone+0x37/0x50 kernel/fork.c:2098
>> do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
>> return_from_SYSCALL_64+0x0/0x7a
>> Freed:
>> PID = 25681
>> __cache_free mm/slab.c:3514 [inline]
>> kmem_cache_free+0x71/0x240 mm/slab.c:3774
>> free_task_struct kernel/fork.c:158 [inline]
>> free_task+0x151/0x1d0 kernel/fork.c:370
>> copy_process.part.38+0x18e5/0x4aa0 kernel/fork.c:1931
>> copy_process kernel/fork.c:1531 [inline]
>> _do_fork+0x200/0x1010 kernel/fork.c:1994
>> SYSC_clone kernel/fork.c:2104 [inline]
>> SyS_clone+0x37/0x50 kernel/fork.c:2098
>> do_syscall_64+0x2e8/0x930 arch/x86/entry/common.c:281
>> return_from_SYSCALL_64+0x0/0x7a
>>
>> Changes since v1:
>> - dropped "x86: remove unused atomic_inc_short()" patch
>> it is mailed separately
>> - rebased on top of tip/locking/core head
>> - other changes noted within individual patches
>>
>> Changes since v2:
>> - rebased on top of tip/locking/core head
>> - dropped a pervasive "x86: use long long for 64-bit atomic ops" commit,
>> instead use s64 type in wrappers
>> - added "x86: use s64* for old arg of atomic64_try_cmpxchg()" commit
>>
>> Changes since v3 are noted in individual commits.
>>
>> Changes since v4:
>> - rebased on tip/locking/core HEAD
>>
>> Changes since v5:
>> - rework cmpxchg* implementations so that we have less
>> code in macros and more code in functions
>
> Some context.
> This revives a half-year old patch. v5 of this was applied to
> tip/locking/core, but then reverted due to a reported crash:
> https://groups.google.com/d/msg/kasan-dev/ZJl66N7smmk/lJY99HmmAgAJ
>
> The root cause was in the cmpxchg macros:
>
> #define cmpxchg64(ptr, old, new) \
> ({ \
> __typeof__(ptr) ____ptr = (ptr); \
> kasan_check_write(____ptr, sizeof(*____ptr)); \
> arch_cmpxchg64(____ptr, (old), (new)); \
> })
>
> I had to introduce the new ____ptr variable so that
> kasan_check_write() and arch_cmpxchg64() don't evaluate ptr (and its
> side effects) twice. But with multiple layers of macros each declaring
> ____ptr, the inner ____ptr ended up referring to something else (to
> itself).
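Both hazards can be shown in isolation. A user-space sketch (hypothetical names, not the kernel macros): a naive macro evaluates its argument twice, the temp-variable macro evaluates it once but breaks when nested, and a real function, as v6 does, avoids both:

```c
#include <assert.h>

static int slot = 42;
static int nevals; /* counts how often the pointer expression runs */

static int *get_ptr(void)
{
	nevals++;
	return &slot;
}

static void check(int *p) { (void)p; } /* stand-in for kasan_check_write() */

/* Naive macro: 'ptr' appears twice, so get_ptr() runs twice. */
#define READ_CHECKED_BAD(ptr) (check(ptr), *(ptr))

/* Temp-variable fix: evaluates 'ptr' once.  But if two macro layers
 * both declare ____ptr, the inner "____ptr = (____ptr)" initializes
 * the variable from itself -- the bug that got v5 reverted. */
#define READ_CHECKED_TMP(ptr) \
	({ __typeof__(ptr) ____ptr = (ptr); check(____ptr); *____ptr; })

/* Function approach (as in v6): its own scope, argument evaluated
 * exactly once, so neither hazard can occur. */
static int read_checked(int *p)
{
	check(p);
	return *p;
}
```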
>
> v6 changes cmpxchg macros to (as was suggested by Mark and Thomas):
>
> static __always_inline unsigned long
> cmpxchg_size(volatile void *ptr, unsigned long old, unsigned long new, int size)
> {
> kasan_check_write(ptr, size);
> switch (size) {
> case 1:
> return arch_cmpxchg((u8 *)ptr, (u8)old, (u8)new);
> case 2:
> return arch_cmpxchg((u16 *)ptr, (u16)old, (u16)new);
> case 4:
> return arch_cmpxchg((u32 *)ptr, (u32)old, (u32)new);
> case 8:
> BUILD_BUG_ON(sizeof(unsigned long) != 8);
> return arch_cmpxchg((u64 *)ptr, (u64)old, (u64)new);
> }
> BUILD_BUG();
> return 0;
> }
>
> #define cmpxchg(ptr, old, new) \
> ({ \
> ((__typeof__(*(ptr)))cmpxchg_size((ptr), (unsigned long)(old), \
> (unsigned long)(new), sizeof(*(ptr)))); \
> })
>
> Otherwise the patch series were rebased without any conflicts (surprisingly).
>
> There is now some duplication between cmpxchg_size, cmpxchg_local_size
> and sync_cmpxchg_size, but I thought that introducing more macros while
> trying to fix bugs caused by a macro mess was not the best idea.
Why I am reviving this patch now: besides the fact that it needs to be
done sooner or later anyway (it is also useful for KMSAN/KTSAN), there
is now another reason. Silent heap corruptions have a very negative
effect on syzbot. If we miss a corruption on an atomic access, syzbot
can report a dozen induced crashes in random places. These look broken,
unexplainable and non-reproducible, and they cause complaints. It's
better to proactively eliminate this known source of silent
corruptions.