Re: [PATCH 1/2] nbd: make nbd device wait for its users

From: Pranay Srivastava
Date: Sat Jun 25 2016 - 13:52:14 EST


On Fri, Jun 24, 2016 at 2:59 PM, Markus Pargmann <mpa@xxxxxxxxxxxxxx> wrote:
> From: "Pranay Kr. Srivastava" <pranjas@xxxxxxxxx>
>
> When a timeout occurs or a recv fails, then instead of abruplty killing
> nbd block device wait for it's users to finish.
>
> This is more required when filesystem(s) like ext2 or ext3 don't expect
> their buffer heads to disappear while the filesystem is mounted.
>
> Each open is counted. The blockdevice is kept open until the last user
> closes the block device. This offers the possibility as well to open a
> new socket to be used while the filesystems are mounted.
>
> Signed-off-by: Pranay Kr. Srivastava <pranjas@xxxxxxxxx>
>
> [mpa: Keep the blockdevice open until all users left]
> Signed-off-by: Markus Pargmann <mpa@xxxxxxxxxxxxxx>
> ---
> Hi,
>
> I used your patch and changed it a bit based on my ideas. The general
> difference is that this keeps the block device open. After all users left, the
> device is reset.
>
> The followup patch then restricts access to ioctls after a disconnect. I wanted
> to avoid that anyone sets up anything new without all the old users leaving.
>
> Please let me know what you think about this.
>
> Best Regards,
>
> Markus
>
> drivers/block/nbd.c | 62 ++++++++++++++++++++++++++++++++++++++---------------
> 1 file changed, 45 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
> index 1efc26bf1d21..620660f3ff0f 100644
> --- a/drivers/block/nbd.c
> +++ b/drivers/block/nbd.c
> @@ -69,6 +69,8 @@ struct nbd_device {
> #if IS_ENABLED(CONFIG_DEBUG_FS)
> struct dentry *dbg_dir;
> #endif
> + atomic_t users; /* Users that opened the block device */

We don't need to put bdev in nbd struct. We can get it via gendisk and
bdget/bdput.
> + struct block_device *bdev;
> };
>
> #if IS_ENABLED(CONFIG_DEBUG_FS)
> @@ -655,9 +657,26 @@ static int nbd_set_socket(struct nbd_device *nbd, struct socket *sock)
> return ret;
> }
>
> +static void nbd_bdev_reset(struct block_device *bdev)
> +{
> + set_device_ro(bdev, false);
> + bdev->bd_inode->i_size = 0;

i_size_write should be better I guess.

> + if (max_part > 0) {
> + blkdev_reread_part(bdev);
> + bdev->bd_invalidated = 1;
> + }
> +}
> +
> /* Reset all properties of an NBD device */
> static void nbd_reset(struct nbd_device *nbd)
> {
> + sock_shutdown(nbd);
> + nbd_clear_que(nbd);
> + if (nbd->bdev) {
> + kill_bdev(nbd->bdev);
> + nbd_bdev_reset(nbd->bdev);
> + }
> +
> nbd->disconnect = false;
> nbd->timedout = false;
> nbd->blksize = 1024;

I actually forgot to ask about this blksize. We do set_blocksize call
for bdev, but
shouldn't we instead do blkdev_logicial_blocksize?

Mount would set the blocksize depending on the filesystem's block size and the
block device logical block size supported. Should we just get rid of this?

> @@ -669,16 +688,6 @@ static void nbd_reset(struct nbd_device *nbd)
> del_timer_sync(&nbd->timeout_timer);
> }
>
> -static void nbd_bdev_reset(struct block_device *bdev)
> -{
> - set_device_ro(bdev, false);
> - bdev->bd_inode->i_size = 0;
> - if (max_part > 0) {
> - blkdev_reread_part(bdev);
> - bdev->bd_invalidated = 1;
> - }
> -}
> -
> static void nbd_parse_flags(struct nbd_device *nbd, struct block_device *bdev)
> {
> if (nbd->flags & NBD_FLAG_READ_ONLY)
> @@ -803,18 +812,11 @@ static int __nbd_ioctl(struct block_device *bdev, struct nbd_device *nbd,

How about disallowing any ioctl until the nbd_device has been reset?
Perhaps, throw error
in open instead of checking here?

> mutex_lock(&nbd->tx_lock);
> nbd->task_recv = NULL;
>
> - sock_shutdown(nbd);
> - nbd_clear_que(nbd);
> - kill_bdev(bdev);
> - nbd_bdev_reset(bdev);
> -
> if (nbd->disconnect) /* user requested, ignore socket errors */
> error = 0;
> if (nbd->timedout)
> error = -ETIMEDOUT;
>

Doesn't this change the semantics for user space? NBD_DO_IT is
supposed to be over either on error or a
user disconnect not before that right[?].

> - nbd_reset(nbd);
> -
> return error;
> }
>
> @@ -853,10 +855,35 @@ static int nbd_ioctl(struct block_device *bdev, fmode_t mode,
> return error;
> }
>
> +static int nbd_open(struct block_device *bdev, fmode_t mode)
> +{
> + struct nbd_device *nbd = bdev->bd_disk->private_data;
> +
> + atomic_inc(&nbd->users);
> +
> + if (!nbd->bdev)
> + nbd->bdev = bdev;
> +
> + return 0;
> +}
> +
> +static void nbd_release(struct gendisk *disk, fmode_t mode)
> +{
> + struct nbd_device *nbd = disk->private_data;
> +
> + if (atomic_dec_and_test(&nbd->users)) {
> + mutex_lock(&nbd->tx_lock);
> + nbd_reset(nbd);
> + mutex_unlock(&nbd->tx_lock);
> + }
> +}
> +
What if the following happens,

1) nbd-client [nbd-c1] attaches the block device

2) mount this nbd device

3) somebody goes crazy and does nbd-client -d <nbd_dev>

Step 3) would cause a DISCONNECT and a CLEAR_SOCK ioctl to be fired,
but in CLEAR_SOCK there's a call to kill_bdev, so I guess that needs
to go as well[?].

> static const struct block_device_operations nbd_fops = {
> .owner = THIS_MODULE,
> .ioctl = nbd_ioctl,
> .compat_ioctl = nbd_ioctl,
> + .open = nbd_open,
> + .release = nbd_release,
> };
>
> #if IS_ENABLED(CONFIG_DEBUG_FS)
> @@ -1087,6 +1114,7 @@ static int __init nbd_init(void)
> disk->private_data = &nbd_dev[i];
> sprintf(disk->disk_name, "nbd%d", i);
> nbd_reset(&nbd_dev[i]);
> + atomic_set(&nbd_dev[i].users, 0);
> add_disk(disk);
> }
>
> --
> 2.1.4
>

--
---P.K.S