BUG: unable to handle kernel NULL pointer dereference when mounting/umounting vfat in 4.3.0, worked in 4.2.4

From: Mads Lønsethagen
Date: Fri Nov 06 2015 - 15:44:04 EST


After updating from 4.2.4 to 4.3.0, I can no longer list files in /boot after mounting it, and I get a kernel BUG when I try to umount it.

exai ~ # mount /boot
exai ~ # sync
exai ~ # mount
[ ... snip ... ]
/dev/sda1 on /boot type vfat (rw,noatime,fmask=0022,dmask=0022,codepage=865,iocharset=utf8,shortname=mixed,errors=remount-ro)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,size=808900k,mode=700,uid=1000,gid=1000)
exai ~ # ls -l /boot
ls: cannot open directory /boot: No such device or address
exai ~ # umount /boot/
Killed
exai ~ # dmesg | tail -50
[ 47.959725] cfg80211: (5150000 KHz - 5250000 KHz @ 80000 KHz, 200000 KHz AUTO), (N/A, 2000 mBm), (N/A)
[ 47.959726] cfg80211: (5250000 KHz - 5350000 KHz @ 80000 KHz, 200000 KHz AUTO), (N/A, 2000 mBm), (0 s)
[ 47.959727] cfg80211: (5470000 KHz - 5725000 KHz @ 160000 KHz), (N/A, 2698 mBm), (0 s)
[ 47.959728] cfg80211: (57000000 KHz - 66000000 KHz @ 2160000 KHz), (N/A, 4000 mBm), (N/A)
[ 101.965931] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[ 101.966053] IP: [<ffffffff8110219e>] truncate_inode_pages_range+0x1e/0x6a0
[ 101.966152] PGD 838e7067 PUD 6c8db067 PMD 0
[ 101.966222] Oops: 0000 [#1] PREEMPT SMP
[ 101.966300] Modules linked in: iwlmvm iwlwifi vfat fat uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev x86_pkg_temp_thermal coretemp kvm_intel kvm microcode i2c_i801 iTCO_wdt xhci_pci xhci_hcd ideapad_laptop sparse_keymap int3403_thermal int3402_thermal processor_thermal_device int340x_thermal_zone intel_soc_dts_iosf int3400_thermal iosf_mbi acpi_thermal_rel intel_smartconnect efivarfs
[ 101.967059] CPU: 0 PID: 1311 Comm: umount Not tainted 4.3.0-gentoo #1
[ 101.967151] Hardware name: LENOVO 20266/Yoga2, BIOS 76CN42WW 03/02/2015
[ 101.967206] task: ffff880087a23000 ti: ffff88006c92c000 task.ti: ffff88006c92c000
[ 101.967269] RIP: 0010:[<ffffffff8110219e>] [<ffffffff8110219e>] truncate_inode_pages_range+0x1e/0x6a0
[ 101.967354] RSP: 0018:ffff88006c92fcd0 EFLAGS: 00010282
[ 101.967395] RAX: 0000000000000000 RBX: ffffffffffffffff RCX: 9e37fffffffc0001
[ 101.967453] RDX: ffffffffffffffff RSI: 0000000000000000 RDI: ffff88008897c770
[ 101.967512] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 101.967571] R10: ffff88008897c718 R11: 0000000000000000 R12: ffffffffa03468c0
[ 101.967630] R13: ffff88006c930000 R14: ffff8802532bd438 R15: ffff88008897c690
[ 101.967689] FS: 00007fabc7f61780(0000) GS:ffff88025f200000(0000) knlGS:0000000000000000
[ 101.967757] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 101.967802] CR2: 0000000000000028 CR3: 000000006c8df000 CR4: 00000000001406f0
[ 101.967880] Stack:
[ 101.967897] ffff88008897c770 0000000000000000 ffff880087a23000 0000000000000000
[ 101.967966] ffffffff81100678 0000000000000000 ffffffff810fefd6 ffff88006c92fe58
[ 101.968034] 00ffffff00000000 00000002900e19c0 ffffffff810fd640 ffff8802540b8248
[ 101.968102] Call Trace:
[ 101.968117] [<ffffffff81100678>] ? pagevec_lookup_tag+0x18/0x20
[ 101.968167] [<ffffffff810fefd6>] ? write_cache_pages+0xe6/0x390
[ 101.968215] [<ffffffff810fd640>] ? domain_dirty_limits+0xe0/0xe0
[ 101.968266] [<ffffffff81088273>] ? finish_task_switch+0x53/0x180
[ 101.968316] [<ffffffff810f54f6>] ? find_get_pages_tag+0x126/0x160
[ 101.968366] [<ffffffff8116bc02>] ? __inode_wait_for_writeback+0x62/0xb0
[ 101.968422] [<ffffffff8109c420>] ? autoremove_wake_function+0x30/0x30
[ 101.968478] [<ffffffffa03435a0>] ? fat_evict_inode+0x10/0x50 [fat]
[ 101.968530] [<ffffffff8115ffa3>] ? evict+0xb3/0x180
[ 101.968567] [<ffffffff8116009d>] ? dispose_list+0x2d/0x40
[ 101.968611] [<ffffffff81160e3a>] ? evict_inodes+0x13a/0x150
[ 101.968656] [<ffffffff81148e15>] ? generic_shutdown_super+0x35/0xe0
[ 101.968707] [<ffffffff8114914c>] ? kill_block_super+0x1c/0x60
[ 101.968754] [<ffffffff81149264>] ? deactivate_locked_super+0x34/0x60
[ 101.968806] [<ffffffff81163db6>] ? cleanup_mnt+0x36/0x80
[ 101.968860] [<ffffffff81082a7f>] ? task_work_run+0x6f/0x90
[ 101.968917] [<ffffffff810013f5>] ? prepare_exit_to_usermode+0x95/0xd0
[ 101.968971] [<ffffffff8175066f>] ? int_ret_from_sys_call+0x25/0x8f
[ 101.969021] Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 41 57 41 56 41 55 41 54 55 48 89 f5 53 48 89 d3 48 81 ec 10 01 00 00 48 8b 07 48 89 3c 24 <48> 8b 40 28 8b 80 08 04 00 00 85 c0 78 05 e8 cf 19 04 00 48 8b
[ 101.969295] RIP [<ffffffff8110219e>] truncate_inode_pages_range+0x1e/0x6a0
[ 101.969355] RSP <ffff88006c92fcd0>
[ 101.969377] CR2: 0000000000000028
[ 101.990401] ---[ end trace a5cb453620b7ad23 ]---
exai ~ #

I reported this at bugzilla.kernel.org[1], thinking it had something to do with vfat, but OGAWA Hirofumi disassembled the code from the oops:

OGAWA Hirofumi 2015-11-06 19:36:56 UTC
--------------------------------------

Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 41 57 41 56 41 55 41 54 55 48 89 f5 53 48 89 d3 48 81 ec 10 01 00 00 48 8b 07 48 89 3c 24 <48> 8b 40 28 8b 80 08 04 00 00 85 c0 78 05 e8 cf 19 04 00 48 8b

Disassembly of oops code

   0:   ff                      (bad)
   1:   ff c3                   inc    %ebx
   3:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
   a:   00 00 00
   d:   41 57                   push   %r15
   f:   41 56                   push   %r14
  11:   41 55                   push   %r13
  13:   41 54                   push   %r12
  15:   55                      push   %rbp
  16:   48 89 f5                mov    %rsi,%rbp
  19:   53                      push   %rbx
  1a:   48 89 d3                mov    %rdx,%rbx
  1d:   48 81 ec 10 01 00 00    sub    $0x110,%rsp
  24:   48 8b 07                mov    (%rdi),%rax
  27:   48 89 3c 24             mov    %rdi,(%rsp)
  2b:   48 8b 40 28             mov    0x28(%rax),%rax
  2f:   8b 80 08 04 00 00       mov    0x408(%rax),%eax
  35:   85 c0                   test   %eax,%eax
  37:   78 05                   js     0x3e
  39:   e8 cf 19 04 00          callq  0x41a0d
  3e:   48                      rex.W
  3f:   8b                      .byte 0x8b
  40:   a0                      .byte 0xa0

  24: %rdi would be mapping
      %rax would be mapping->host
  2b: 0x28(%rax) == mapping->host->i_sb
  2f: 0x408(%rax) == mapping->host->i_sb->cleancache_poolid

And it seems to be host->i_sb == NULL then.

There is no change in v4.2..v4.3, so this is likely a bug in some other
part of the kernel. It might be memory corruption, a race, or similar.

--------------------------------------

So, I don't know who I should CC: on this...

I can make an image of the vfat filesystem and put it up somewhere; it's only 32 MB. Should I do that? And/or post the config.gz?


- Mads Lønsethagen

[1] https://bugzilla.kernel.org/show_bug.cgi?id=107361