Re: INFO: task hung in generic_file_write_iter

From: Dmitry Vyukov
Date: Tue Jan 08 2019 - 06:49:23 EST

Next message: Anup Patel: "Re: [PATCH v2 5/8] RISC-V: Compare cpuid with NR_CPUS before mapping."
Previous message: Anup Patel: "Re: [PATCH v2 4/8] RISC-V: Allow hartid-to-cpuid function to fail."
In reply to: Jan Kara: "Re: INFO: task hung in generic_file_write_iter"
Next in thread: Jan Kara: "Re: INFO: task hung in generic_file_write_iter"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Tue, Jan 8, 2019 at 12:24 PM Jan Kara <jack@xxxxxxx> wrote:
>
> On Tue 08-01-19 19:04:06, Tetsuo Handa wrote:
> > On 2019/01/03 2:26, Jan Kara wrote:
> > > On Thu 03-01-19 01:07:25, Tetsuo Handa wrote:
> > >> On 2019/01/02 23:40, Jan Kara wrote:
> > >>> I had a look into this and the only good explanation for this I have is
> > >>> that sb->s_blocksize is different from (1 << sb->s_bdev->bd_inode->i_blkbits).
> > >>> If that would happen, we'd get exactly the behavior syzkaller observes
> > >>> because grow_buffers() would populate different page than
> > >>> __find_get_block() then looks up.
> > >>>
> > >>> However I don't see how that's possible since the filesystem has the block
> > >>> device open exclusively and blkdev_bszset() makes sure we also have
> > >>> exclusive access to the block device before changing the block device size.
> > >>> So changing block device block size after filesystem gets access to the
> > >>> device should be impossible.
> > >>>
> > >>> Anyway, could you perhaps add to your debug patch a dump of 'size' passed
> > >>> to __getblk_slow() and bdev->bd_inode->i_blkbits? That should tell us
> > >>> whether my theory is right or not. Thanks!
> > >>>
> >
> > Got two reports. 'size' is 512 while bdev->bd_inode->i_blkbits is 12.
> >
> > https://syzkaller.appspot.com/text?tag=CrashLog&x=1237c3ab400000
> >
> > [ 385.723941][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12
> > (...snipped...)
> > [ 568.159544][ T439] kworker/u4:3(439): getblk(): executed=9 bh_count=0 bh_state=0 bdev_super_blocksize=512 size=512 bdev_super_blocksize_bits=9 bdev_inode_blkbits=12
>
> Right, so indeed the block size in the superblock and in the block device
> gets out of sync which explains why we endlessly loop in the buffer cache
> code. The superblock uses blocksize of 512 while the block device thinks
> the set block size is 4096.
>
> And after staring into the code for some time, I finally have a trivial
> reproducer:
>
> truncate -s 1G /tmp/image
> losetup /dev/loop0 /tmp/image
> mkfs.ext4 -b 1024 /dev/loop0
> mount -t ext4 /dev/loop0 /mnt
> losetup -c /dev/loop0
> l /mnt
> <hangs>
>
> And the problem is that LOOP_SET_CAPACITY ioctl ends up reseting block
> device block size to 4096 by calling bd_set_size(). I have to think how to
> best fix this...
>
> Thanks for your help with debugging this!

Wow! I am very excited.
We have 587 open "task hung" reports, I suspect this explains lots of them.
What would be some pattern that we can use to best-effort distinguish
most manifestations? Skimming through few reports I see "inode_lock",
"get_super", "blkdev_put" as common indicators. Anything else?

Next message: Anup Patel: "Re: [PATCH v2 5/8] RISC-V: Compare cpuid with NR_CPUS before mapping."
Previous message: Anup Patel: "Re: [PATCH v2 4/8] RISC-V: Allow hartid-to-cpuid function to fail."
In reply to: Jan Kara: "Re: INFO: task hung in generic_file_write_iter"
Next in thread: Jan Kara: "Re: INFO: task hung in generic_file_write_iter"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]