Re: [PATCH] zram: rework reset and destroy path
From: Minchan Kim
Date: Tue Feb 03 2015 - 18:42:18 EST
Hello, Sergey,
On Wed, Feb 04, 2015 at 01:15:06AM +0900, Sergey Senozhatsky wrote:
> We need to move set_capacity(disk, 0) from reset_store() back
> into zram_reset_device() -- a catch by Ganesh Mahendran. Otherwise,
> set_capacity() calls from the init and reset paths can race.
>
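Just to spell out the race for the record (a sketch; the
disksize_store() side is paraphrased from the current code):

	reset_store():                   disksize_store():
	  zram_reset_device(zram);
	  /* ->init_lock released */
	                                 down_write(&zram->init_lock);
	                                 ...
	                                 set_capacity(zram->disk,
	                                     disksize >> SECTOR_SHIFT);
	                                 up_write(&zram->init_lock);
	  set_capacity(zram->disk, 0);

so the stale set_capacity(zram->disk, 0) can land after a new
disksize has been set, because it runs outside of ->init_lock.
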
> The problem is that zram_reset_device() is also getting called
> from zram_exit(), which performs operations in a misleadingly
> reversed order -- we first create_device() and then init it,
> while zram_exit() performs destroy_device() first and then does
> zram_reset_device(). This is done to remove the sysfs group before
> we reset the device, so device reset/destruction cannot be raced
> by a sysfs attr write (e.g. disksize).
>
> Apart from that, destroy_device() releases zram->disk (while we
> still hold the ->disk pointer), so we cannot access zram->disk in
> the later zram_reset_device() call, which may cause additional
> errors in the future.
>
> So, this patch reworks and cleans up the destroy path.
>
> 1) remove several unneeded goto labels in zram_init()
> 2) factor out the zram_init() error path and zram_exit() into
> a destroy_devices() function, which takes the number of devices
> to destroy as its argument.
> 3) remove the sysfs group in destroy_devices() first, so we can
> reorder operations -- device reset (as expected) now goes before
> disk destruction and queue cleanup, and we can always access
> ->disk in zram_reset_device().
> 4) and, finally, move set_capacity() back under ->init_lock.
>
> Reported-by: Ganesh Mahendran <opensource.ganesh@xxxxxxxxx>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@xxxxxxxxx>
Looks good to me. Minor nit below.
> ---
> drivers/block/zram/zram_drv.c | 71 ++++++++++++++++++-------------------------
> 1 file changed, 30 insertions(+), 41 deletions(-)
>
> diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
> index a32069f..7d2e86f 100644
> --- a/drivers/block/zram/zram_drv.c
> +++ b/drivers/block/zram/zram_drv.c
> @@ -734,8 +734,9 @@ static void zram_reset_device(struct zram *zram)
> zram->meta = NULL;
> /* Reset stats */
> memset(&zram->stats, 0, sizeof(zram->stats));
> -
> zram->disksize = 0;
> + set_capacity(zram->disk, 0);
> +
> up_write(&zram->init_lock);
> }
>
> @@ -828,7 +829,6 @@ static ssize_t reset_store(struct device *dev,
> /* Make sure all pending I/O is finished */
> fsync_bdev(bdev);
> zram_reset_device(zram);
> - set_capacity(zram->disk, 0);
>
> mutex_unlock(&bdev->bd_mutex);
> revalidate_disk(zram->disk);
> @@ -1114,15 +1114,29 @@ out:
> return ret;
> }
>
> -static void destroy_device(struct zram *zram)
> +static void destroy_devices(unsigned int nr)
> {
> - sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
> - &zram_disk_attr_group);
> + struct zram *zram;
> + unsigned int i;
>
> - del_gendisk(zram->disk);
> - put_disk(zram->disk);
> + for (i = 0; i < nr; i++) {
> + zram = &zram_devices[i];
> + /* remove sysfs first, so no one will perform disksize
> + * store while we are destroying devices */
> + sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
> + &zram_disk_attr_group);
>
> - blk_cleanup_queue(zram->queue);
> + zram_reset_device(zram);
> +
> + del_gendisk(zram->disk);
> + put_disk(zram->disk);
> +
> + blk_cleanup_queue(zram->queue);
> + }
> +
> + kfree(zram_devices);
> + unregister_blkdev(zram_major, "zram");
> + pr_debug("Destroyed %u device(s)\n", nr);
The create path just reports the number of created devices, so I
don't think it's worth emitting per-device information in
destroy_devices(). Let's just emit a "Cleanup done" message like
the old zram_exit() did, but use pr_info() instead of pr_debug().
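IOW, end destroy_devices() with something like (just a sketch):

	kfree(zram_devices);
	unregister_blkdev(zram_major, "zram");
	pr_info("Cleanup done!\n");
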
Another concern: I'd like to keep a per-device interface (e.g.
create_device(), destroy_device()) because there was a request to
add new zram devices dynamically. I guess you remember that.
Although I didn't have enough time to respond back then, Alex
finally convinced me, so I hope a contributor who has the time
and interest will pick it up.
For that, a per-device create/destroy interface looks better.
https://lkml.org/lkml/2014/8/8/142
Anyway, I don't expect that to happen any time soon, so I'm not
strongly against your patch (i.e. create_device()/destroy_devices()),
because I think we can refactor it when we need to.
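For example, keeping destroy_device() could be as simple as
factoring the loop body back out (a rough sketch, using the same
calls as in your patch):

	static void destroy_device(struct zram *zram)
	{
		/*
		 * Remove sysfs group first, so no one can write
		 * disksize (or any other attr) while we tear the
		 * device down.
		 */
		sysfs_remove_group(&disk_to_dev(zram->disk)->kobj,
				&zram_disk_attr_group);

		zram_reset_device(zram);

		del_gendisk(zram->disk);
		put_disk(zram->disk);
		blk_cleanup_queue(zram->queue);
	}

Then a future hot add/remove interface could destroy a single
device without touching the others.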
Thanks.
> }
>
> static int __init zram_init(void)
> @@ -1132,64 +1146,39 @@ static int __init zram_init(void)
> if (num_devices > max_num_devices) {
> pr_warn("Invalid value for num_devices: %u\n",
> num_devices);
> - ret = -EINVAL;
> - goto out;
> + return -EINVAL;
> }
>
> zram_major = register_blkdev(0, "zram");
> if (zram_major <= 0) {
> pr_warn("Unable to get major number\n");
> - ret = -EBUSY;
> - goto out;
> + return -EBUSY;
> }
>
> /* Allocate the device array and initialize each one */
> zram_devices = kzalloc(num_devices * sizeof(struct zram), GFP_KERNEL);
> if (!zram_devices) {
> ret = -ENOMEM;
> - goto unregister;
> + goto out_error;
> }
>
> for (dev_id = 0; dev_id < num_devices; dev_id++) {
> ret = create_device(&zram_devices[dev_id], dev_id);
> if (ret)
> - goto free_devices;
> + goto out_error;
> }
>
> - pr_info("Created %u device(s) ...\n", num_devices);
> -
> + pr_info("Created %u device(s)\n", num_devices);
> return 0;
>
> -free_devices:
> - while (dev_id)
> - destroy_device(&zram_devices[--dev_id]);
> - kfree(zram_devices);
> -unregister:
> - unregister_blkdev(zram_major, "zram");
> -out:
> +out_error:
> + destroy_devices(dev_id);
> return ret;
> }
>
> static void __exit zram_exit(void)
> {
> - int i;
> - struct zram *zram;
> -
> - for (i = 0; i < num_devices; i++) {
> - zram = &zram_devices[i];
> -
> - destroy_device(zram);
> - /*
> - * Shouldn't access zram->disk after destroy_device
> - * because destroy_device already released zram->disk.
> - */
> - zram_reset_device(zram);
> - }
> -
> - unregister_blkdev(zram_major, "zram");
> -
> - kfree(zram_devices);
> - pr_debug("Cleanup done!\n");
> + destroy_devices(num_devices);
> }
>
> module_init(zram_init);
> --
> 2.3.0.rc2
>
--
Kind regards,
Minchan Kim