RE: "fs/namei.c: keep track of nd->root refcount status" causes boot panic
From: Dexuan Cui
Date: Tue Sep 03 2019 - 01:52:39 EST
> From: Dexuan-Linux Cui <dexuan.linux@xxxxxxxxx>
> Sent: Monday, September 2, 2019 10:22 PM
> To: Qian Cai <cai@xxxxxx>
> Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>; linux-fsdevel@xxxxxxxxxxxxxxx; LKML
> <linux-kernel@xxxxxxxxxxxxxxx>; Dexuan Cui <decui@xxxxxxxxxxxxx>; Lili Deng
> (Wicresoft North America Ltd) <v-lide@xxxxxxxxxxxxx>
> Subject: Re: "fs/namei.c: keep track of nd->root refcount status" causes boot
> panic
>
> On Mon, Sep 2, 2019 at 9:22 PM Qian Cai <cai@xxxxxx> wrote:
> >
> > The linux-next commit "fs/namei.c: keep track of nd->root refcount status" [1]
> > causes boot panic on all architectures here on today's linux-next (0902).
> > Reverted it will fix the issue.
>
> I believe I'm seeing the same issue with next-20190902 in a Linux VM
> running on Hyper-V (next-20190830 is good).
>
> git-bisect points to the same commit in linux-next:
> e013ec23b823 ("fs/namei.c: keep track of nd->root refcount status")
>
> I can reproduce the issue every time I reboot the system.
>
> Thanks,
> Dexuan
BTW, I tried the patch at https://lkml.org/lkml/2019/8/31/158, but it did not help at all.
FYI, this is my call trace:
[ 16.843452] Run /init as init process
Loading, please wait...
starting version 239
[ 16.936476] ------------[ cut here ]------------
[ 16.937929] DEBUG_LOCKS_WARN_ON(!test_bit(class_idx, lock_classes_in_use))
[ 16.937929] WARNING: CPU: 10 PID: 366 at kernel/locking/lockdep.c:3850 __lock_acquire.isra.34+0x50c/0x560
[ 16.937929] Modules linked in:
[ 16.937929] CPU: 10 PID: 366 Comm: udevadm Not tainted 5.3.0-rc1+ #26
[ 16.937929] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008 12/07/2018
[ 16.937929] RIP: 0010:__lock_acquire.isra.34+0x50c/0x560
[ 16.937929] Code: 00 85 c0 0f 84 72 fe ff ff 8b 1d af 5b 2b 01 85 db 0f 85 64 fe ff ff 48 c7 c6 08 97 07...
[ 16.937929] RSP: 0018:ffffc90003ff3c40 EFLAGS: 00010086
[ 16.937929] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
[ 16.937929] RDX: ffffffff810e3d63 RSI: 0000000000000001 RDI: ffffffff822628a0
[ 16.937929] RBP: 0000000000000000 R08: ffffffff82c0e420 R09: 0000000000039440
[ 16.937929] R10: 0000001209f646b6 R11: 000000000000016e R12: ffff888276440040
[ 16.937929] R13: 0000000000000000 R14: 0000000000000000 R15: ffff888276440818
[ 16.937929] FS: 00007f4ee2f0f8c0(0000) GS:ffff88827d700000(0000) knlGS:0000000000000000
[ 16.937929] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 16.937929] CR2: 000055dce7403000 CR3: 0000000276772003 CR4: 00000000003606e0
[ 16.937929] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 16.937929] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 16.937929] Call Trace:
[ 16.937929] lock_acquire+0xae/0x160
[ 16.937929] ? dput.part.34+0x164/0x380
[ 16.937929] ? dput.part.34+0x29/0x380
[ 16.937929] _raw_spin_lock+0x2c/0x40
[ 16.937929] ? dput.part.34+0x164/0x380
[ 16.937929] dput.part.34+0x164/0x380
[ 17.098529] terminate_walk+0xde/0x100
[ 17.098529] path_lookupat.isra.62+0xa3/0x220
[ 17.098529] filename_lookup.part.77+0xa0/0x170
[ 17.098529] ? kmem_cache_alloc+0x169/0x2a0
[ 17.098529] do_readlinkat+0x5d/0x110
[ 17.098529] __x64_sys_readlinkat+0x1a/0x20
[ 17.098529] do_syscall_64+0x5d/0x1c0
[ 17.098529] ? prepare_exit_to_usermode+0x7b/0xb0
[ 17.098529] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 17.098529] RIP: 0033:0x7f4ee378da4a
[ 17.098529] Code: 48 8b 0d 49 84 0d 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 ...
[ 17.098529] RSP: 002b:00007fffbddb7968 EFLAGS: 00000202 ORIG_RAX: 000000000000010b
[ 17.098529] RAX: ffffffffffffffda RBX: 000055dce740b220 RCX: 00007f4ee378da4a
[ 17.098529] RDX: 000055dce740b220 RSI: 000055dce740b201 RDI: 0000000000000005
[ 17.098529] RBP: 0000000000000064 R08: 000055dce73fa010 R09: 0000000000000000
[ 17.098529] R10: 0000000000000063 R11: 0000000000000202 R12: 000055dce740b201
[ 17.098529] R13: 0000000000000005 R14: 00007fffbddb79f8 R15: 0000000000000063
[ 17.098529] ---[ end trace 6af6f6ebcc3937e8 ]---
It looks like the aforementioned patch ("fs/namei.c: keep track of nd->root
refcount status") causes memory corruption.
If I revert the patch, everything goes back to normal.
Thanks,
Dexuan