Re: INFO: task hung in sync_blockdev

From: Jan Kara
Date: Thu Feb 08 2018 - 11:20:26 EST


On Thu 08-02-18 06:49:18, Andi Kleen wrote:
> > > It seems multiple processes deadlocked on the bd_mutex.
> > > Unfortunately there's no backtrace for the lock acquisitions,
> > > so it's hard to see the exact sequence.
> >
> > Well, all in the report points to a situation where some IO was submitted
> > to the block device and never completed (more exactly it took longer than
> > those 120s to complete that IO). It would need more digging into the
>
> Are you sure? I didn't think outstanding IO would take bd_mutex.

The stack trace of the hung task is:

schedule+0xf5/0x430 kernel/sched/core.c:3480
io_schedule+0x1c/0x70 kernel/sched/core.c:5096
wait_on_page_bit_common+0x4b3/0x770 mm/filemap.c:1099
wait_on_page_bit mm/filemap.c:1132 [inline]
wait_on_page_writeback include/linux/pagemap.h:546 [inline]
__filemap_fdatawait_range+0x282/0x430 mm/filemap.c:533
filemap_fdatawait_range mm/filemap.c:558 [inline]
filemap_fdatawait include/linux/fs.h:2590 [inline]
filemap_write_and_wait+0x7a/0xd0 mm/filemap.c:624
__sync_blockdev fs/block_dev.c:448 [inline]
sync_blockdev.part.29+0x50/0x70 fs/block_dev.c:457
sync_blockdev fs/block_dev.c:444 [inline]
__blkdev_put+0x18b/0x7f0 fs/block_dev.c:1763
blkdev_put+0x85/0x4f0 fs/block_dev.c:1835
blkdev_close+0x8b/0xb0 fs/block_dev.c:1842
__fput+0x327/0x7e0 fs/file_table.c:209
____fput+0x15/0x20 fs/file_table.c:243


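So we are waiting for PageWriteback to be cleared on some page, i.e. for
writeback IO to complete. And this process grabbed bd_mutex in
__blkdev_put() before calling sync_blockdev(), so it holds bd_mutex for the
whole wait and everybody else who needs bd_mutex blocks behind it.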
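
To illustrate the pattern, here is a minimal user-space sketch. It is not
kernel code: pthread mutexes and a condition variable stand in for bd_mutex
and the page writeback wait, and the helper names (closer_thread,
complete_io, try_bd_mutex, run_scenario) as well as the 100 ms timeout are
made up for the example. It only shows that a task which takes the mutex
and then waits for IO keeps every other taker blocked until that IO
completes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* User-space model only; names mirror the kernel ones for readability. */
static pthread_mutex_t bd_mutex = PTHREAD_MUTEX_INITIALIZER; /* models bd_mutex */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t state_changed = PTHREAD_COND_INITIALIZER;
static bool page_writeback = true;    /* models PageWriteback on one page */
static bool closer_has_mutex = false; /* lets the demo wait for the closer */

/* Models __blkdev_put(): take bd_mutex, then wait for writeback to finish. */
static void *closer_thread(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&bd_mutex);

    pthread_mutex_lock(&state_lock);
    closer_has_mutex = true;
    pthread_cond_broadcast(&state_changed);
    while (page_writeback) /* models wait_on_page_writeback() */
        pthread_cond_wait(&state_changed, &state_lock);
    pthread_mutex_unlock(&state_lock);

    pthread_mutex_unlock(&bd_mutex);
    return NULL;
}

/* Models IO completion clearing the writeback flag. */
static void complete_io(void)
{
    pthread_mutex_lock(&state_lock);
    page_writeback = false;
    pthread_cond_broadcast(&state_changed);
    pthread_mutex_unlock(&state_lock);
}

/* Models a second task needing bd_mutex; returns 0 or ETIMEDOUT. */
static int try_bd_mutex(long timeout_ms)
{
    struct timespec deadline;

    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    int rc = pthread_mutex_timedlock(&bd_mutex, &deadline);
    if (rc == 0)
        pthread_mutex_unlock(&bd_mutex);
    return rc;
}

/* Tries bd_mutex once while the IO is pending and once after it completes. */
static void run_scenario(int *while_io_pending, int *after_io_done)
{
    pthread_t closer;
    int rc = pthread_create(&closer, NULL, closer_thread, NULL);
    assert(rc == 0);

    /* Wait until the closer really holds bd_mutex. */
    pthread_mutex_lock(&state_lock);
    while (!closer_has_mutex)
        pthread_cond_wait(&state_changed, &state_lock);
    pthread_mutex_unlock(&state_lock);

    *while_io_pending = try_bd_mutex(100); /* closer holds it: times out */
    complete_io();
    pthread_join(closer, NULL);
    *after_io_done = try_bd_mutex(100);    /* closer is gone: succeeds */
}
```

Calling run_scenario() from a small main() should give ETIMEDOUT for the
first attempt and 0 for the second. In the report the completion never
arrived within the hung task timeout, so the tasks blocked on bd_mutex got
flagged as well.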

Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR