Re: 9p2000.L stat/unlink race (WARNING: fs/inode.c:280 drop_nlink)

From: Vegard Nossum
Date: Tue Aug 23 2016 - 17:17:15 EST


On 08/21/2016 01:55 PM, Vegard Nossum wrote:
Hi,

I've been seeing this:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 2377 at fs/inode.c:280 drop_nlink+0xc2/0x120
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 2377 Comm: syz-executor Not tainted 4.8.0-rc2+ #145
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
0000000000000000 ffff880116437b50 ffffffff81f9f201 ffffffff83e72d60
ffff880116437c28 ffffffff83ef6100 ffffffff847070c0 ffff880116437c18
ffffffff8150538a 0000000041b58ab3 ffffffff844dc5dd ffffffff815051a7
Call Trace:
[<ffffffff81f9f201>] dump_stack+0x83/0xb2
[<ffffffff8150538a>] panic+0x1e3/0x3a1
[<ffffffff815051a7>] ? set_ti_thread_flag+0x1e/0x1e
[<ffffffff81505a58>] ? rt_mutex_top_waiter.part.10+0x6/0x6
[<ffffffff83a4af46>] ? p9_idpool_put+0x36/0x40
[<ffffffff816c85d2>] ? drop_nlink+0xc2/0x120
[<ffffffff8126df4f>] __warn+0x1bf/0x1e0
[<ffffffff8126e13c>] warn_slowpath_null+0x2c/0x40
[<ffffffff816c85d2>] drop_nlink+0xc2/0x120
[<ffffffff81b17106>] v9fs_remove+0x256/0x2d0
[<ffffffff81b1719f>] v9fs_vfs_unlink+0x1f/0x30
[<ffffffff816a7130>] vfs_unlink+0x2b0/0x490
[<ffffffff816a7840>] do_unlinkat+0x530/0x660
[<ffffffff816a7310>] ? vfs_unlink+0x490/0x490
[<ffffffff81002f2f>] ? syscall_trace_enter+0x3cf/0xdb0
[<ffffffff8150465a>] ? __context_tracking_exit.part.4+0x9a/0x1e0
[<ffffffff81002b60>] ? exit_to_usermode_loop+0x190/0x190
[<ffffffff82001b07>] ? check_preemption_disabled+0x37/0x1e0
[<ffffffff82001cec>] ? __this_cpu_preempt_check+0x1c/0x20
[<ffffffff816a79f0>] ? SyS_unlinkat+0x80/0x80
[<ffffffff816a7a0a>] SyS_unlink+0x1a/0x20
[<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
[<ffffffff83c3286a>] entry_SYSCALL64_slow_path+0x25/0x25

It seems to happen when fstat() calls into v9fs_vfs_getattr_dotl() and
that does a v9fs_stat2inode_dotl() which changes inode->i_nlink from 1
to 0 (without any locking). The unlink() which was already in progress
then sees 0 when it does drop_nlink().

My mount flags are:

trans=virtio,version=9p2000.L

If I add "cache=loose" then there is no warning (as expected, since
v9fs_vfs_getattr_dotl() bails out early in that case and doesn't change
the in-memory inode).

With cache=loose I'm also seeing this, which looks to me like it might
be a related race:

------------[ cut here ]------------
WARNING: CPU: 0 PID: 5224 at mm/truncate.c:761 pagecache_isize_extended+0x188/0x1f0
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 5224 Comm: syz-executor Not tainted 4.8.0-rc2+ #161
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
0000000000000000 ffff88010f0dfae8 ffffffff81f9f1c1 ffffffff83e72d60
ffff88010f0dfbc0 ffffffff83ed1a00 ffffffff847070c0 ffff88010f0dfbb0
ffffffff8150534a 0000000041b58ab3 ffffffff844dc5cd ffffffff81505167
Call Trace:
[<ffffffff81f9f1c1>] dump_stack+0x83/0xb2
[<ffffffff8150534a>] panic+0x1e3/0x3a1
[<ffffffff81505167>] ? set_ti_thread_flag+0x1e/0x1e
[<ffffffff81505a18>] ? rt_mutex_top_waiter.part.10+0x6/0x6
[<ffffffff81544828>] ? pagecache_isize_extended+0x188/0x1f0
[<ffffffff8126df4f>] __warn+0x1bf/0x1e0
[<ffffffff8126e13c>] warn_slowpath_null+0x2c/0x40
[<ffffffff81544828>] pagecache_isize_extended+0x188/0x1f0
[<ffffffff815448f8>] truncate_setsize+0x68/0xa0
[<ffffffff81b1c173>] v9fs_vfs_setattr_dotl+0x363/0x510
[<ffffffff81b1be10>] ? v9fs_mapped_iattr_valid+0x1f0/0x1f0
[<ffffffff81e5a570>] ? apparmor_path_rmdir+0x30/0x30
[<ffffffff813a07b6>] ? timespec_trunc+0x76/0xe0
[<ffffffff81b1be10>] ? v9fs_mapped_iattr_valid+0x1f0/0x1f0
[<ffffffff816cfa0d>] notify_change+0x6dd/0xdd0
[<ffffffff8166c3f1>] do_truncate+0x121/0x1c0
[<ffffffff8166c2d0>] ? do_dentry_open+0xbc0/0xbc0
[<ffffffff81e5a570>] ? apparmor_path_rmdir+0x30/0x30
[<ffffffff8166c7db>] do_sys_ftruncate.constprop.9+0x34b/0x500
[<ffffffff8166d060>] ? compat_SyS_truncate+0x30/0x30
[<ffffffff8166d07c>] SyS_ftruncate+0x1c/0x30
[<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
[<ffffffff810052fc>] ? syscall_return_slowpath+0x16c/0x1d0
[<ffffffff83c3282a>] entry_SYSCALL64_slow_path+0x25/0x25


Vegard