KASAN: use-after-free Read in link_path_walk

From: Dae R. Jeong
Date: Mon Jul 23 2018 - 23:45:50 EST


Reporting the crash: KASAN: use-after-free Read in link_path_walk

This crash has been found in v4.17-rc1 using RaceFuzzer (a modified
version of Syzkaller), which we describe more at the end of this
report. Our analysis shows that the race occurs when invoking two
syscalls concurrently, open() and chroot().

Diagnosis:
We think that it is possible that link_path_walk() dereferences a
freed pointer when cleanup_mnt() is executed between path_init() and
link_path_walk().

Since I'm not an expert on a file system and don't fully understand
the crash, please see a executed program and a crash log below in
case that my understanding is wrong.


Executed Program:
Thread0 Thread1
mkdir("./file0")
|--------------------------|
| mount("./file0", "./file0", "devpts", 0x0, "")
| |
openat(AT_FDCWD, chroot("./file0")
"/dev/vcs", 0x200, 0x0) umount("./file0", 0x2)

openat(), chroot(), umount() syscalls are executed after mount() syscall.
We think a race occurs between openat() and chroot() because RaceFuzzer
executed openat() and chroot() concurrently.


(Possible) Thread interleaving:
CPU0 (path_openat) CPU1 (cleanup_mnt)
===== =====
s = path_init(nd, flags);
if (IS_ERR(s)) {
put_filp(file);
return ERR_CAST(s);
}

deactivate_super(mnt->mnt.mnt_sb);

while (!(error = link_path_walk(s, nd)) &&

// (in link_path_walk())
struct dentry *parent = nd->path.dentry;
nd->flags &= ~LOOKUP_JUMPED;
if (unlikely(parent->d_flags & DCACHE_OP_HASH)) { // UAF occured


Crash log:
==================================================================
BUG: KASAN: use-after-free in link_path_walk+0x46e/0xcd0 fs/namei.c:2061
Read of size 4 at addr ffff8801cbe6cb80 by task syz-executor0/28699

CPU: 0 PID: 28699 Comm: syz-executor0 Not tainted 4.17.0-rc1 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x166/0x21c lib/dump_stack.c:113
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:354 [inline]
kasan_report+0x23f/0x360 mm/kasan/report.c:412
check_memory_region_inline mm/kasan/kasan.c:260 [inline]
__asan_load4+0x78/0x80 mm/kasan/kasan.c:698
link_path_walk+0x46e/0xcd0 fs/namei.c:2061
path_openat+0x23c/0x2040 fs/namei.c:3500
do_filp_open+0x175/0x230 fs/namei.c:3535
do_sys_open+0x3c7/0x4a0 fs/open.c:1093
__do_sys_open fs/open.c:1111 [inline]
__se_sys_open fs/open.c:1106 [inline]
__x64_sys_open+0x4c/0x60 fs/open.c:1106
do_syscall_64+0x15f/0x4a0 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x410601
RSP: 002b:00007f7345489660 EFLAGS: 00000293 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: cccccccccccccccd RCX: 0000000000410601
RDX: 0000000000000000 RSI: 0000000000010180 RDI: 00007f7345489710
RBP: 00000000000006e1 R08: 236573756f6d2f74 R09: 0000000000000000
R10: 00000000200004c0 R11: 0000000000000293 R12: 00007f734548a6d4
R13: 00000000ffffffff R14: 00000000006ff5b8 R15: 0000000000000000

Allocated by task 28699:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
kasan_kmalloc+0xae/0xe0 mm/kasan/kasan.c:553
kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554
__d_alloc+0xc0/0x6e0 fs/dcache.c:1638
d_alloc_anon fs/dcache.c:1742 [inline]
d_make_root+0x2d/0x70 fs/dcache.c:1934
devpts_fill_super+0x23b/0x500 fs/devpts/inode.c:482
mount_nodev+0x59/0xd0 fs/super.c:1211
devpts_mount+0x2c/0x40 fs/devpts/inode.c:509
mount_fs+0x50/0x200 fs/super.c:1268
vfs_kern_mount.part.26+0xbc/0x2c0 fs/namespace.c:1037
vfs_kern_mount fs/namespace.c:2514 [inline]
do_new_mount fs/namespace.c:2517 [inline]
do_mount+0xb82/0x1bb0 fs/namespace.c:2847
ksys_mount+0xab/0x120 fs/namespace.c:3063
__do_sys_mount fs/namespace.c:3077 [inline]
__se_sys_mount fs/namespace.c:3074 [inline]
__x64_sys_mount+0x67/0x80 fs/namespace.c:3074
do_syscall_64+0x15f/0x4a0 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe

Freed by task 28700:
save_stack+0x43/0xd0 mm/kasan/kasan.c:448
set_track mm/kasan/kasan.c:460 [inline]
__kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
__cache_free mm/slab.c:3498 [inline]
kmem_cache_free+0x83/0x2a0 mm/slab.c:3756
__d_free fs/dcache.c:257 [inline]
dentry_free+0x8c/0xe0 fs/dcache.c:347
__dentry_kill+0x3d6/0x440 fs/dcache.c:582
dentry_kill+0x8f/0x320 fs/dcache.c:686
dput.part.22+0x430/0x4e0 fs/dcache.c:850
dput fs/dcache.c:830 [inline]
do_one_tree+0x43/0x50 fs/dcache.c:1523
shrink_dcache_for_umount+0xa5/0x1c0 fs/dcache.c:1537
generic_shutdown_super+0xb0/0x330 fs/super.c:425
kill_anon_super fs/super.c:1037 [inline]
kill_litter_super+0x48/0x60 fs/super.c:1047
devpts_kill_sb+0x49/0x50 fs/devpts/inode.c:519
deactivate_locked_super+0x71/0xb0 fs/super.c:313
deactivate_super+0x10f/0x150 fs/super.c:344
cleanup_mnt+0x6b/0xc0 fs/namespace.c:1173
__cleanup_mnt+0x16/0x20 fs/namespace.c:1180
task_work_run+0x152/0x1b0 kernel/task_work.c:113
tracehook_notify_resume include/linux/tracehook.h:191 [inline]
exit_to_usermode_loop+0x262/0x270 arch/x86/entry/common.c:166
prepare_exit_to_usermode arch/x86/entry/common.c:196 [inline]
syscall_return_slowpath arch/x86/entry/common.c:265 [inline]
do_syscall_64+0x473/0x4a0 arch/x86/entry/common.c:290
entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff8801cbe6cb80
which belongs to the cache dentry(17:syz0) of size 288
The buggy address is located 0 bytes inside of
288-byte region [ffff8801cbe6cb80, ffff8801cbe6cca0)
The buggy address belongs to the page:
page:ffffea00072f9b00 count:1 mapcount:0 mapping:ffff8801cbe6c080 index:0x0
flags: 0x2fffc0000000100(slab)
raw: 02fffc0000000100 ffff8801cbe6c080 0000000000000000 000000010000000b
raw: ffffea00072f8ca0 ffffea00072f8da0 ffff8801dc812c80 ffff8801de41a740
page dumped because: kasan: bad access detected
page->mem_cgroup:ffff8801de41a740

Memory state around the buggy address:
ffff8801cbe6ca80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801cbe6cb00: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff8801cbe6cb80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff8801cbe6cc00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff8801cbe6cc80: fb fb fb fb fc fc fc fc fc fc fc fc fb fb fb fb
==================================================================


= About RaceFuzzer

RaceFuzzer is a customized version of Syzkaller, specifically tailored
to find race condition bugs in the Linux kernel. While we leverage
many different technique, the notable feature of RaceFuzzer is in
leveraging a custom hypervisor (QEMU/KVM) to interleave the
scheduling. In particular, we modified the hypervisor to intentionally
stall a per-core execution, which is similar to supporting per-core
breakpoint functionality. This allows RaceFuzzer to force the kernel
to deterministically trigger racy condition (which may rarely happen
in practice due to randomness in scheduling).

RaceFuzzer's C repro always pinpoints two racy syscalls. Since C
repro's scheduling synchronization should be performed at the user
space, its reproducibility is limited (reproduction may take from 1
second to 10 minutes (or even more), depending on a bug). This is
because, while RaceFuzzer precisely interleaves the scheduling at the
kernel's instruction level when finding this bug, C repro cannot fully
utilize such a feature. Please disregard all code related to
"should_hypercall" in the C repro, as this is only for our debugging
purposes using our own hypervisor.