KASAN error in [PATCH v3 7/8] proc/kcore: optimize multiple page reads
From: Dominique Martinet
Date: Tue Aug 28 2018 - 07:00:21 EST
> The current code does a full search of the segment list every time for
> every page. This is wasteful, since it's almost certain that the next
> page will be in the same segment. Instead, check if the previous segment
> covers the current page before doing the list search.
>
> Signed-off-by: Omar Sandoval <osandov@xxxxxx>
> ---
> fs/proc/kcore.c | 14 +++++++++++---
> 1 file changed, 11 insertions(+), 3 deletions(-)
>
> diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
> index e2ca58d49938..25fefdd05ee5 100644
> --- a/fs/proc/kcore.c
> +++ b/fs/proc/kcore.c
> @@ -428,10 +428,18 @@ read_kcore(struct file *file, char __user *buffer,
> size_t buflen, loff_t *fpos)
> if ((tsz = (PAGE_SIZE - (start & ~PAGE_MASK))) > buflen)
> tsz = buflen;
>
> + m = NULL;
> while (buflen) {
> - list_for_each_entry(m, &kclist_head, list) {
> - if (start >= m->addr && start < (m->addr+m->size))
> - break;
> + /*
> + * If this is the first iteration or the address is not within
> + * the previous entry, search for a matching entry.
> + */
> + if (!m || start < m->addr || start >= m->addr + m->size) {
This line apparently triggers a KASAN warning; I have been seeing it since
I rebooted into 4.19-rc1.
It is 100% reproducible on my machine when the kdump service starts
(Fedora 28 x86_64 VM). Here's the full stack (on 4.19-rc1):
[ 38.161102] BUG: KASAN: global-out-of-bounds in read_kcore+0xd5c/0xf20
[ 38.162123] Read of size 8 at addr ffffffffa6c0f770 by task kexec/6201
[ 38.163386] CPU: 16 PID: 6201 Comm: kexec Not tainted 4.19.0-rc1+ #13
[ 38.164374] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.2-0-gf9626ccb91-prebuilt.qemu-project.org 04/01/2014
[ 38.166211] Call Trace:
[ 38.166658] dump_stack+0x71/0xa9
[ 38.167443] print_address_description+0x65/0x22e
[ 38.168194] ? read_kcore+0xd5c/0xf20
[ 38.168778] kasan_report.cold.6+0x241/0x306
[ 38.169458] read_kcore+0xd5c/0xf20
[ 38.170018] ? open_kcore+0x1d0/0x1d0
[ 38.170605] ? avc_has_perm_noaudit+0x370/0x370
[ 38.171291] ? kasan_unpoison_shadow+0x30/0x40
[ 38.171973] ? kasan_kmalloc+0xbf/0xe0
[ 38.172562] ? kmem_cache_alloc_trace+0x105/0x200
[ 38.173289] ? open_kcore+0x5f/0x1d0
[ 38.173858] ? open_kcore+0x5f/0x1d0
[ 38.174428] ? deref_stack_reg+0xe0/0xe0
[ 38.175038] proc_reg_read+0x18b/0x220
[ 38.175652] ? proc_reg_unlocked_ioctl+0x210/0x210
[ 38.176399] __vfs_read+0xe1/0x6b0
[ 38.176930] ? __x64_sys_copy_file_range+0x450/0x450
[ 38.177723] ? do_filp_open+0x190/0x250
[ 38.178313] ? may_open_dev+0xc0/0xc0
[ 38.178886] ? __fsnotify_update_child_dentry_flags.part.3+0x330/0x330
[ 38.179883] ? __fsnotify_inode_delete+0x20/0x20
[ 38.180608] ? __inode_security_revalidate+0x8e/0xb0
[ 38.181378] vfs_read+0xde/0x2c0
[ 38.181889] ksys_read+0xb2/0x160
[ 38.182413] ? kernel_write+0x130/0x130
[ 38.183000] ? task_work_run+0x74/0x1c0
[ 38.183621] do_syscall_64+0xa0/0x2e0
[ 38.184183] ? async_page_fault+0x8/0x30
[ 38.184802] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 38.185610] RIP: 0033:0x7fc525d74091
[ 38.186155] Code: fe ff ff 50 48 8d 3d b6 b6 09 00 e8 59 05 02 00 66 0f 1f 84 00 00 00 00 00 48 8d 05 51 39 2d 00 8b 00 85 c0 75 13 31 c0 0f 05 <48> 3d 00 f0 ff ff 77 57 c3 66 0f 1f 44 00 00 41 54 49 89 d4 55 48
[ 38.188980] RSP: 002b:00007fffd6802a28 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[ 38.190153] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc525d74091
[ 38.191240] RDX: 0000000000010000 RSI: 00000000017f90b0 RDI: 0000000000000004
[ 38.192329] RBP: 0000000000010000 R08: 00007fc526043420 R09: 0000000000000001
[ 38.194593] R10: 00000000017e8010 R11: 0000000000000246 R12: 00000000017f90b0
[ 38.196823] R13: 0000000000000004 R14: 00007fffd6802ac8 R15: 00007fffd6802cb0
[ 38.200526] The buggy address belongs to the variable:
[ 38.202539] kclist_head+0x10/0x440
[ 38.205568] Memory state around the buggy address:
[ 38.207411] ffffffffa6c0f600: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 38.209648] ffffffffa6c0f680: 00 00 00 00 00 00 00 00 04 fa fa fa fa fa fa fa
[ 38.211812] >ffffffffa6c0f700: 00 00 00 00 00 fa fa fa fa fa fa fa 00 00 fa fa
[ 38.213936] ^
[ 38.216010] ffffffffa6c0f780: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
[ 38.218178] ffffffffa6c0f800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 38.220306] ==================================================================
where
[ 37.636589] read_kcore+0xd5c/0xf20
symbolizes to
[< none >] read_kcore+0xd5c/0xf20 fs/proc/kcore.c:454
which is the check quoted above.
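The "kclist_head+0x10" in the report looks consistent with reading m->addr through a pointer that was computed from the list head itself. Here is a small userspace sketch of that arithmetic. The struct below is only a model: I am assuming the list member comes first and addr follows it directly, and the names kcore_list_model and stale_addr_offset_from_head are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Model of a kcore segment entry; the member order is an assumption. */
struct kcore_list_model {
	struct list_head list;	/* assumed to be the first member */
	unsigned long addr;
	size_t size;
};

/*
 * When list_for_each_entry() runs off the end without a break, the
 * cursor is container_of(&head, struct kcore_list_model, list), not a
 * real entry. This returns how far past the start of the head a read
 * of "m->addr" then lands.
 */
static size_t stale_addr_offset_from_head(void)
{
	/* The model relies on the list member sitting at offset 0. */
	assert(offsetof(struct kcore_list_model, list) == 0);

	return offsetof(struct kcore_list_model, addr) -
	       offsetof(struct kcore_list_model, list);
}
```

With two 8-byte pointers in struct list_head this comes out as 0x10, that is, just past the end of the head variable, which would match what KASAN prints.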
I haven't checked, but I think we get there after going through the first
case below:
if (&m->list == &kclist_head) {
meaning no address matched in the list on the previous iteration, so m
does not point to a real entry and you cannot check m->addr and m->size
in this case. I'm afraid you will have to run through the list again
when that happens, even if there likely won't be any match for the next
address either.
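To make the suggestion concrete, here is a rough userspace sketch of a lookup that only trusts the cached entry when it is a real one, and otherwise walks the list again. This is not a tested kernel patch; struct seg, seg_of, seg_list_init, seg_list_add_tail and seg_lookup are hypothetical names for this example only.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-in for a kcore segment entry. */
struct seg {
	struct list_head list;
	unsigned long addr;
	size_t size;
};

/* Minimal container_of() replacement for this model. */
#define seg_of(ptr) \
	((struct seg *)((char *)(ptr) - offsetof(struct seg, list)))

static void seg_list_init(struct list_head *head)
{
	head->next = head->prev = head;
}

static void seg_list_add_tail(struct seg *s, struct list_head *head)
{
	assert(s && head);
	s->list.prev = head->prev;
	s->list.next = head;
	head->prev->next = &s->list;
	head->prev = &s->list;
}

/*
 * Return the segment covering 'start', trying the cached one first.
 * Returns NULL when nothing matches, so the caller never ends up
 * holding a pointer derived from the list head. The range test is
 * written as a subtraction so that addr + size cannot wrap.
 */
static struct seg *seg_lookup(struct list_head *head, struct seg *cached,
			      unsigned long start)
{
	struct list_head *pos;

	if (cached && start >= cached->addr &&
	    start - cached->addr < cached->size)
		return cached;

	for (pos = head->next; pos != head; pos = pos->next) {
		struct seg *s = seg_of(pos);

		if (start >= s->addr && start - s->addr < s->size)
			return s;
	}
	return NULL;
}
```

The caller would keep the returned pointer as the cache for the next page, so a miss (NULL) simply forces a full walk next time instead of a read through the list head.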
> + list_for_each_entry(m, &kclist_head, list) {
> + if (start >= m->addr &&
> + start < m->addr + m->size)
> + break;
> + }
> }
>
> if (&m->list == &kclist_head) {
--
Dominique Martinet