Re: bio linked list corruption.

From: Linus Torvalds
Date: Wed Oct 26 2016 - 15:06:40 EST


On Wed, Oct 26, 2016 at 11:42 AM, Dave Jones <davej@xxxxxxxxxxxxxxxxx> wrote:
>
> The stacks show nearly all of them are stuck in sync_inodes_sb

That's just wb_wait_for_completion(), and it means that some IO isn't
completing.

There are also a lot of processes waiting for inode_lock(), and a few
waiting for mnt_want_write().

Ignoring those, we have

> [<ffffffffa009554f>] btrfs_wait_ordered_roots+0x3f/0x200 [btrfs]
> [<ffffffffa00470d1>] btrfs_sync_fs+0x31/0xc0 [btrfs]
> [<ffffffff811fbd4e>] sync_filesystem+0x6e/0xa0
> [<ffffffff811fbebc>] SyS_syncfs+0x3c/0x70
> [<ffffffff8100255c>] do_syscall_64+0x5c/0x170
> [<ffffffff817908cb>] entry_SYSCALL64_slow_path+0x25/0x25
> [<ffffffffffffffff>] 0xffffffffffffffff

Don't know this one. There's a couple of them. Could there be some
ABBA deadlock on the ordered roots waiting?
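
For reference, an ABBA deadlock means two threads take the same pair of
locks in opposite orders. A minimal sketch of checking recorded lock
orders for that pattern follows. The thread and lock names are
hypothetical and are not taken from btrfs; this only illustrates the
idea, not how the kernel's lockdep works.

```python
from collections import defaultdict


def find_abba(orders):
    """Return lock pairs that were acquired in both orders.

    `orders` maps a (hypothetical) thread name to the list of lock
    names it acquires, in order, while still holding the earlier ones.
    """
    seen = defaultdict(set)  # (first, second) -> threads using that order
    for thread, locks in orders.items():
        for i, first in enumerate(locks):
            for second in locks[i + 1:]:
                seen[(first, second)].add(thread)

    inversions = []
    for (a, b) in seen:
        # Report each inverted pair once, in sorted name order.
        if a < b and (b, a) in seen:
            inversions.append((a, b))
    return sorted(inversions)


# Hypothetical example: two waiters taking two roots in opposite order.
orders = {
    "waiter_1": ["root_A", "root_B"],
    "waiter_2": ["root_B", "root_A"],
}
print(find_abba(orders))
```

If every thread takes the locks in the same order, the function returns
an empty list, which is the usual way to rule this class of deadlock out.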

> [<ffffffff8131ae87>] call_rwsem_down_write_failed+0x17/0x30
> [<ffffffffa008ed32>] btrfs_fallocate+0xb2/0xfd0 [btrfs]
> [<ffffffff811c6c3e>] vfs_fallocate+0x13e/0x220
> [<ffffffff811c79f3>] SyS_fallocate+0x43/0x80
> [<ffffffff8100255c>] do_syscall_64+0x5c/0x170
> [<ffffffff817908cb>] entry_SYSCALL64_slow_path+0x25/0x25
> [<ffffffffffffffff>] 0xffffffffffffffff

This one is also inode_lock(), and is interesting only because it's
fallocate(), which has shown up so many times before.

But there are other threads blocked in do_truncate, in
btrfs_file_write_iter, or in lseek instead, so nothing else sets this
one apart.
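
Grouping the traces by caller makes that comparison easier to see. A
rough sketch, assuming each frame has the usual
"[<address>] symbol+0xoffset/0xsize [module]" form seen in the quoted
traces; the helper names parse_frame and tally_callers are made up for
this illustration.

```python
import re
from collections import Counter

# Matches e.g. "[<ffffffffa008ed32>] btrfs_fallocate+0xb2/0xfd0 [btrfs]"
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<sym>[\w.]+)"
    r"\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_frame(line):
    """Return (symbol, offset, module) for a frame line, else None."""
    m = FRAME_RE.search(line)
    if m is None:
        return None  # e.g. the trailing 0xffffffffffffffff marker
    return m.group("sym"), int(m.group("off"), 16), m.group("mod")


def tally_callers(traces, blocked_in):
    """Count which function called `blocked_in` in each trace."""
    counts = Counter()
    for trace in traces:
        frames = [f for f in map(parse_frame, trace) if f is not None]
        # Frames are listed innermost first, so the caller is the next one.
        for inner, outer in zip(frames, frames[1:]):
            if inner[0] == blocked_in:
                counts[outer[0]] += 1
                break
    return counts


trace = [
    "> [<ffffffff8131ae87>] call_rwsem_down_write_failed+0x17/0x30",
    "> [<ffffffffa008ed32>] btrfs_fallocate+0xb2/0xfd0 [btrfs]",
    "> [<ffffffff811c6c3e>] vfs_fallocate+0x13e/0x220",
    "> [<ffffffffffffffff>] 0xffffffffffffffff",
]
print(tally_callers([trace], "call_rwsem_down_write_failed"))
```

Fed all of the dumped stacks, the counter would show how many waiters
come from fallocate versus truncate, write, or lseek.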

> [<ffffffff81149fbf>] wait_on_page_bit+0xaf/0xc0
> [<ffffffff8114a121>] __filemap_fdatawait_range+0x151/0x170
> [<ffffffff8114d79c>] filemap_fdatawait_keep_errors+0x1c/0x20
> [<ffffffff811f59b3>] sync_inodes_sb+0x273/0x300
> [<ffffffff811fbd37>] sync_filesystem+0x57/0xa0
> [<ffffffff811fbebc>] SyS_syncfs+0x3c/0x70
> [<ffffffff8100255c>] do_syscall_64+0x5c/0x170
> [<ffffffff817908cb>] entry_SYSCALL64_slow_path+0x25/0x25
> [<ffffffffffffffff>] 0xffffffffffffffff

This is actually waiting on the page. Possibly this is the IO that is
never completing, and keeps the inode lock.

> [<ffffffffa009576b>] btrfs_start_ordered_extent+0x5b/0xb0 [btrfs]
> [<ffffffffa008bf5d>] lock_and_cleanup_extent_if_need+0x22d/0x290 [btrfs]
> [<ffffffffa008d1e8>] __btrfs_buffered_write+0x1b8/0x6e0 [btrfs]
> [<ffffffffa0090e60>] btrfs_file_write_iter+0x170/0x550 [btrfs]
> [<ffffffff811c97d8>] do_iter_readv_writev+0xa8/0x100
> [<ffffffff811ca162>] do_readv_writev+0x172/0x210
> [<ffffffff811ca42a>] vfs_writev+0x3a/0x50
> [<ffffffff811ca5c0>] do_pwritev+0xb0/0xd0
> [<ffffffff811cb57c>] SyS_pwritev+0xc/0x10
> [<ffffffff8100255c>] do_syscall_64+0x5c/0x170
> [<ffffffff817908cb>] entry_SYSCALL64_slow_path+0x25/0x25

Hmm. This is the one that *started* the ordered extents (as opposed to
the ones waiting for it).

I dunno. There might be a lost IO. More likely it's the same
corruption that causes it, it just didn't result in an oops this time.

Linus