Re: md related oops triggered in bdev_inode_switch_bdi

From: Wu Fengguang
Date: Wed Aug 31 2011 - 23:31:16 EST


Hi Neil,

> Subject: [PATCH] Avoid dereferencing a 'request_queue' after last close.

Reviewed-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>

with comments below.

> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -1430,6 +1430,12 @@ static int __blkdev_put(struct block_device *bdev, fmode_t mode, int for_part)
>  		sync_blockdev(bdev);
>  		kill_bdev(bdev);
>  	}
> +	if (!bdev->bd_openers)
> +		/* ->release can cause the old bdi to disappear,
> +		 * so must switch it out first
> +		 */
> +		bdev_inode_switch_bdi(bdev->bd_inode,
> +					&default_backing_dev_info);
>  	if (bdev->bd_contains == bdev) {
>  		if (disk->fops->release)
>  			ret = disk->fops->release(disk, mode);

The bdev_inode_switch_bdi() call can be further moved into the
previous if block, like this:

 	if (!--bdev->bd_openers) {
 		WARN_ON_ONCE(bdev->bd_holders);
 		sync_blockdev(bdev);
 		kill_bdev(bdev);
+
+		/* ->release can cause the old bdi to disappear,
+		 * so must switch it out first
+		 */
+		bdev_inode_switch_bdi(bdev->bd_inode,
+					&default_backing_dev_info);
 	}

That way it is obvious that kill_bdev() has already truncated all of
the inode's pages by the time the bdi is switched, so there won't be
any further interaction with dirty writes.
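
To illustrate the ordering argument, here is a minimal userspace sketch.
It is not kernel code: toy_bdi, toy_inode, toy_bdev, toy_open(),
toy_release() and toy_put() are all made-up stand-ins, and the sync and
truncate steps are only noted in a comment. It assumes a ->release that
frees the driver's bdi on last close, which is the case the patch guards
against.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for illustration only, NOT the kernel structures. */
struct toy_bdi { int id; };
struct toy_inode { struct toy_bdi *bdi; };

struct toy_bdev {
	int openers;                 /* models bd_openers */
	struct toy_inode inode;      /* models bd_inode */
	struct toy_bdi *driver_bdi;  /* bdi owned by the "driver" */
};

/* Models default_backing_dev_info: it always outlives the device. */
static struct toy_bdi toy_default_bdi = { 0 };

/* First open allocates the driver's bdi and points the inode at it. */
static int toy_open(struct toy_bdev *bdev)
{
	if (bdev->openers == 0) {
		bdev->driver_bdi = malloc(sizeof(*bdev->driver_bdi));
		if (!bdev->driver_bdi)
			return -1;
		bdev->driver_bdi->id = 1;
		bdev->inode.bdi = bdev->driver_bdi;
	}
	bdev->openers++;
	return 0;
}

/* Models a ->release that makes the old bdi disappear on last close. */
static void toy_release(struct toy_bdev *bdev)
{
	if (bdev->openers == 0) {
		free(bdev->driver_bdi);
		bdev->driver_bdi = NULL;
	}
}

static void toy_put(struct toy_bdev *bdev)
{
	if (!--bdev->openers) {
		/* (sync and page truncation would happen here) */

		/* Switch the inode away BEFORE release can free the bdi. */
		bdev->inode.bdi = &toy_default_bdi;
	}
	toy_release(bdev);
}
```

With two opens followed by two puts, the inode keeps pointing at the
driver's bdi after the first put and at the default one after the
second, so it never holds a pointer to freed memory.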

Although there are dozens of disk->fops->release functions, it's very
unlikely that any of them needs to access an inode on top of the disk
(which would be an illogical thing to do).

So I don't see any problems. It makes sense to push it to linux-next
for broader testing ASAP. Will you do it, or shall I?

Thanks,
Fengguang