Re: kasan behavior when built with unsupported compiler
From: Alexander Potapenko
Date: Tue Mar 07 2017 - 13:55:06 EST
On Tue, Mar 7, 2017 at 6:33 PM, Nikolay Borisov
<n.borisov.lkml@xxxxxxxxx> wrote:
>
>
> On 7.03.2017 18:05, Alexander Potapenko wrote:
>> On Tue, Mar 7, 2017 at 4:54 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>>> On Tue, Mar 7, 2017 at 4:35 PM, Nikolay Borisov
>>> <n.borisov.lkml@xxxxxxxxx> wrote:
>>>> Hello,
>>>>
>>>> I've been chasing a particular UAF as reported by kasan
>>>> (https://www.spinics.net/lists/kernel/msg2458136.html). However, one
>>>> thing which I took notice of rather lately is that I was building my
>>>> kernel with gcc 4.7.4 which is not supported by kasan as indicated by
>>>> the following string:
>>>>
>>>> scripts/Makefile.kasan:19: Cannot use CONFIG_KASAN:
>>>> -fsanitize=kernel-address is not supported by compiler
>>>>
>>>>
>>>> Nevertheless, the kernel compiles and when I boot it I see the kasan
>>>> splats as per the referenced thread. If, however, I build the kernel
>>>> with a newer compiler version 5.4.0 kasan no longer complains.
>>>>
>>>>
>>>> At this point I'm wondering whether the splats can be due to old
>>>> compiler being used e.g. false positives or are they genuine splats and
>>>> gcc 5 somehow obfuscates them ? Clearly despite the warning about not
>>>> being able to use CONFIG_KASAN it is still working since I'm seeing the
>>>> splats. Is this valid behavior ?
>>>
>>>
>>> Hi,
>>>
>>> Re the message that kasan is not supported while it's still enabled in the end.
>>> I think it's an issue related to gcc plugins. Originally kasan was
>>> supported with 5.0+ thus the message. However, later we extended this
>>> support to 4.5+ with gcc plugins. However, that confusing message from
>>> build system was not fixed. So yes, it's confusing and it's something
>>> to fix, but mostly you can just ignore it.
>>>
>>> Re false positives with 4.7. By default I would assume that it is true
>>> positive. Should be easy to check with manual printfs.
>>>
>>> Re why 5.4 does not detect it. Difficult to say.
>>> If you confirm that it's a real bug and provide repro instructions,
>>> then I can recheck it with latest gcc. If it's a real bug and the
>>> latest gcc does not detect it, then we need to look more closely at
>>> it. I'm afraid 5.4 won't be fixed.
>>> It's also possible that it's a false positive in the old compiler (I
>>> think there were some bugs). If so, I would recommend switching to a
>>> newer compiler.
>>
>> I wonder if there's actual KASAN instrumentation in the kernel in
>> question built with GCC 4.7.
>> If there's none, there's little point in investigating this further,
>> as the tool is anyway barely usable.
>> Note that the report originates from something like copy_to_user() (or
>> hard to tell the exact place - Nikolay, could you please symbolize the
>> report?), i.e. it could be triggered even without KASAN
>> instrumentation.
>
> Of course there is kasan instrumentation, otherwise I wouldn't see kasan reports, no?
Not necessarily.
There's KASAN instrumentation inserted by the compiler, and KASAN
instrumentation added manually to the places the compiler can't
instrument.
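To illustrate the distinction, here is a toy userspace model in Python (not
kernel code). ToyShadow, plain_load() and toy_copy_to_user() are hypothetical
names made up for this sketch, and the only thing borrowed from KASAN is the
8-byte granule. It shows that a hand-written check in a copy helper can
produce a report even when the compiler has instrumented nothing:

```python
# Toy userspace model, not kernel code. All names are hypothetical.
GRANULE = 8  # generic KASAN tracks memory in 8-byte granules


class ToyShadow:
    """Per-granule count of accessible bytes (0 means fully poisoned)."""

    def __init__(self, size):
        self.valid = [0] * ((size + GRANULE - 1) // GRANULE)

    def unpoison(self, start, length):
        """Mark [start, start + length) accessible; start must be aligned."""
        assert start % GRANULE == 0
        for off in range(0, length, GRANULE):
            self.valid[(start + off) // GRANULE] = min(GRANULE, length - off)

    def check_region(self, addr, size):
        """Return True if every byte of [addr, addr + size) is accessible."""
        return all(
            (a % GRANULE) < self.valid[a // GRANULE]
            for a in range(addr, addr + size)
        )


def plain_load(shadow, memory, addr, size, compiler_instrumented, reports):
    """An ordinary load: checked only if the 'compiler' emitted a check."""
    if compiler_instrumented and not shadow.check_region(addr, size):
        reports.append(("compiler", addr, size))
    return bytes(memory[addr:addr + size])


def toy_copy_to_user(shadow, memory, addr, size, reports):
    """A copy helper with a hand-written check, independent of the compiler."""
    if not shadow.check_region(addr, size):
        reports.append(("manual", addr, size))
    return bytes(memory[addr:addr + size])


memory = bytearray(64)
shadow = ToyShadow(len(memory))
shadow.unpoison(0, 20)  # a 20-byte object at offset 0

reports = []
# Both reads run 4 bytes past the end of the object.
plain_load(shadow, memory, 16, 8, False, reports)  # uninstrumented: silent
toy_copy_to_user(shadow, memory, 16, 8, reports)   # manual check: reported
print(reports)
```

In this model only the manual check reports the bad read, which is the
situation I suspect you may be in with GCC 4.7.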
> I bisected this to commit 1771c6e1a567ea0ba2cc which adds user memory access API
Commit 1771c6e1a567ea0ba2cc added exactly the check_memory_region()
calls you are seeing.
If the compiler inserted any other instrumentation, it may also catch
accesses to the same object (if something other than copy_to_user()
touches it).
Otherwise the only thing you can do to investigate this bug with GCC
4.7 is to bisect further by rolling back to earlier revisions and
applying 1771c6e1a567ea0ba2cc on top of them.
I wouldn't be surprised, though, if at some point the bisection stops
for a different reason.
> instrumentation. Here is a symbolized report:
>
> ==================================================================
> BUG: KASAN: slab-out-of-bounds in filldir+0xc8/0x170 at addr ffff88006a22560e
> Read of size 20 by task systemd/1
> =============================================================================
> BUG kmalloc-96 (Not tainted): kasan: bad access detected
> -----------------------------------------------------------------------------
>
> Disabling lock debugging due to kernel taint
> INFO: Allocated in ext4_htree_store_dirent+0x3e/0x120 age=0 cpu=2 pid=1
> [< none >] ___slab_alloc+0x636/0x6a0 mm/slub.c:2446
> [< none >] __slab_alloc+0x4f/0x86 mm/slub.c:2475
> [< inline >] slab_alloc_node mm/slub.c:2538
> [< inline >] slab_alloc mm/slub.c:2580
> [< none >] __kmalloc+0x27a/0x340 mm/slub.c:3561
> [< inline >] kmalloc include/linux/slab.h:483
> [< inline >] kzalloc include/linux/slab.h:622
> [< none >] ext4_htree_store_dirent+0x3e/0x120 fs/ext4/dir.c:447
> [< none >] htree_dirblock_to_tree+0x16a/0x190 fs/ext4/namei.c:1001
> [< none >] ext4_htree_fill_tree+0xaa/0x310 fs/ext4/namei.c:1075
> [< inline >] ext4_dx_readdir fs/ext4/dir.c:571
> [< none >] ext4_readdir+0x698/0x950 fs/ext4/dir.c:121
> [< none >] iterate_dir+0x7d/0x190 fs/readdir.c:50
> [< inline >] SYSC_getdents fs/readdir.c:230
> [< none >] SyS_getdents+0x91/0x120 fs/readdir.c:212
> [< none >] entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207
>
> INFO: Freed in ext4_ext_map_blocks+0x7f9/0x23e0 age=1 cpu=2 pid=1
> [< none >] __slab_free+0x31b/0x440 mm/slub.c:2657
> [< inline >] slab_free mm/slub.c:2810
> [< none >] kfree+0x27f/0x2d0 mm/slub.c:3662
> [< none >] ext4_ext_map_blocks+0x7f9/0x23e0 fs/ext4/extents.c:4619
> [< none >] ext4_map_blocks+0x3b4/0x5b0 fs/ext4/inode.c:529
> [< none >] ext4_getblk+0x54/0x1a0 fs/ext4/inode.c:929
> [< none >] ext4_bread+0x13/0x90 fs/ext4/inode.c:979
> [< none >] __ext4_read_dirblock+0x3f/0x380 fs/ext4/namei.c:99
> [< none >] htree_dirblock_to_tree+0x48/0x190 fs/ext4/namei.c:959
> [< none >] ext4_htree_fill_tree+0xaa/0x310 fs/ext4/namei.c:1075
> [< inline >] ext4_dx_readdir fs/ext4/dir.c:571
> [< none >] ext4_readdir+0x698/0x950 fs/ext4/dir.c:121
> [< none >] iterate_dir+0x7d/0x190 fs/readdir.c:50
> [< inline >] SYSC_getdents fs/readdir.c:230
> [< none >] SyS_getdents+0x91/0x120 fs/readdir.c:212
> [< none >] entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207
> INFO: Slab 0xffffea0001a88900 objects=20 used=17 fp=0xffff88006a224e10 flags=0x4080
> INFO: Object 0xffff88006a2255e0 @offset=5600 fp=0x45b282a2484c60d4
>
> Bytes b4 ffff88006a2255d0: 02 00 00 00 01 00 00 00 c9 ac fb ff 00 00 00 00 ................
> Object ffff88006a2255e0: d4 60 4c 48 a2 82 b2 45 18 8a 82 6a 00 88 ff ff .`LH...E...j....
> Object ffff88006a2255f0: 38 51 22 6a 00 88 ff ff 88 8b 82 6a 00 88 ff ff 8Q"j.......j....
> Object ffff88006a225600: 00 00 00 00 00 00 00 00 28 03 08 00 14 01 66 62 ........(.....fb
> Object ffff88006a225610: 64 65 76 2d 62 6c 61 63 6b 6c 69 73 74 2e 63 6f dev-blacklist.co
> Object ffff88006a225620: 6e 66 00 00 00 00 00 00 00 00 00 00 00 00 00 00 nf..............
> Object ffff88006a225630: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> CPU: 2 PID: 1 Comm: systemd Tainted: G B 4.7.0-nbor #171
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
> 0000000000000000 ffff88006cd97c58 ffffffff8146bd4c ffff8800000946c0
> ffff88006a2255e0 ffff88006cd97c88 ffffffff81198d96 ffff8800000946c0
> ffffea0001a88900 ffff88006a2255e0 0000000000000000 ffff88006cd97cb0
> Call Trace:
> [< inline >] __dump_stack lib/dump_stack.c:15
> [<ffffffff8146bd4c>] dump_stack+0x85/0xc9 lib/dump_stack.c:51
> [<ffffffff81198d96>] print_trailer+0x116/0x190 mm/slub.c:667
> [<ffffffff811992c1>] object_err+0x41/0x50 mm/slub.c:674
> [< inline >] print_address_description mm/kasan/report.c:180
> [< inline >] kasan_report_error mm/kasan/report.c:276
> [<ffffffff811a0a42>] kasan_report+0x282/0x530 mm/kasan/report.c:298
> [< inline >] check_memory_region_inline mm/kasan/kasan.c:292
> [<ffffffff8119ffa7>] check_memory_region+0x137/0x160 mm/kasan/kasan.c:299
> [<ffffffff811a0041>] kasan_check_read+0x11/0x20 mm/kasan/kasan.c:304
> [< inline >] copy_to_user ./arch/x86/include/asm/uaccess.h:760
> [<ffffffff811ccc08>] filldir+0xc8/0x170 fs/readdir.c:195
> [< inline >] dir_emit include/linux/fs.h:3134
> [<ffffffff8124af38>] call_filldir+0x88/0x140 fs/ext4/dir.c:510
> [< inline >] ext4_dx_readdir fs/ext4/dir.c:586
> [<ffffffff8124b934>] ext4_readdir+0x714/0x950 fs/ext4/dir.c:121
> [<ffffffff811ccd2d>] iterate_dir+0x7d/0x190 fs/readdir.c:50
> [< inline >] SYSC_getdents fs/readdir.c:230
> [<ffffffff811ccf71>] SyS_getdents+0x91/0x120 fs/readdir.c:212
> [<ffffffff816d7d80>] entry_SYSCALL_64_fastpath+0x23/0xc1 arch/x86/entry/entry_64.S:207
> Memory state around the buggy address:
> ffff88006a225500: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88006a225580: fc fc fc fc fc fc fc fc fc fc fc fc 00 00 00 00
>>ffff88006a225600: 00 00 00 00 05 fc fc fc fc fc fc fc fc fc fc fc
> ^
> ffff88006a225680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> ffff88006a225700: fc fc fc fc fc fc fc fc fc fc fc fc fc fc 00 00
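As an aid for reading the "Memory state" dump above, here is a small Python
sketch that decodes one shadow row. It assumes the generic KASAN encoding:
one shadow byte per 8 bytes of memory, 00 meaning all 8 bytes are accessible,
1..7 meaning only the first N bytes are, and anything else (such as fc)
meaning poisoned. accessible_ranges() is a hypothetical helper written for
this message; the row contents and addresses are copied from the report.

```python
GRANULE = 8  # assumption: generic KASAN, one shadow byte per 8 bytes of memory


def accessible_ranges(row_addr, shadow_bytes):
    """Decode one shadow row into (start, end) accessible ranges, end exclusive."""
    ranges = []
    for i, s in enumerate(shadow_bytes):
        start = row_addr + i * GRANULE
        if s == 0:
            n = GRANULE      # whole granule accessible
        elif 1 <= s < GRANULE:
            n = s            # only the first s bytes accessible
        else:
            n = 0            # poisoned (0xfc here is a redzone marker)
        if n == 0:
            continue
        if ranges and ranges[-1][1] == start:
            ranges[-1] = (ranges[-1][0], start + n)  # extend the previous range
        else:
            ranges.append((start, start + n))
    return ranges


# The row marked with ">>" in the report.
row = bytes.fromhex("00 00 00 00 05" + " fc" * 11)
ranges = accessible_ranges(0xffff88006a225600, row)

# The reported access: "Read of size 20 ... at addr ffff88006a22560e".
access_start, access_size = 0xffff88006a22560e, 20
inside = any(
    lo <= access_start and access_start + access_size <= hi
    for lo, hi in ranges
)
print([(hex(lo), hex(hi)) for lo, hi in ranges], inside)
```

Under that assumption the marked row decodes to a single accessible range of
37 bytes starting at ffff88006a225600, and the reported 20-byte read falls
inside it. Note that the dump shows the shadow state when the report was
printed, which need not be the state at the time of the check.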
I think I've seen similar reports on the Chrome OS 3.14/3.18 kernel
with Clang and GCC 6.0.0.
I never managed to reproduce those reliably; they would go away
every time I tried to catch them.
You can try enabling some of the ext4 debug checks, but I still think
it won't help much as long as you're using an unsupported compiler.
--
Alexander Potapenko
Software Engineer
Google Germany GmbH
Erika-Mann-Straße, 33
80636 München
Geschäftsführer: Matthew Scott Sucherman, Paul Terence Manicle
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg