Re: [PATCH V2 for-6.10/block 1/2] loop: Fix a race between loop detach and loop open

From: Jens Axboe
Date: Tue May 21 2024 - 21:39:49 EST


On 5/21/24 4:42 PM, Gulam Mohamed wrote:
> Description
> ===========
>
> 1. Userspace sends the command "losetup -d" which uses the open() call
> to open the device
> 2. Kernel receives the ioctl command "LOOP_CLR_FD" which calls the
> function loop_clr_fd()
> 3. If LOOP_CLR_FD is the first command received at the time, then the
> AUTOCLEAR flag is not set and deletion of the
> loop device proceeds ahead and scans the partitions (drop/add
> partitions)
>
> if (disk_openers(lo->lo_disk) > 1) {
> lo->lo_flags |= LO_FLAGS_AUTOCLEAR;
> loop_global_unlock(lo, true);
> return 0;
> }
>
> 4. Before scanning partitions, it will check to see if any partition of
> the loop device is currently opened
> 5. If any partition is opened, then it will return EBUSY:
>
> if (disk->open_partitions)
> return -EBUSY;
> 6. So, after receiving the "LOOP_CLR_FD" command and just before the above
> check for open_partitions, if any other command
> (like blkid) opens any partition of the loop device, then the partition
> scan will not proceed and EBUSY is returned as shown in above code
> 7. But in "__loop_clr_fd()", this EBUSY error is not propagated
> 8. We have noticed that this is causing the partitions of the loop to
> remain stale even after the loop device is detached resulting in the
> IO errors on the partitions
>
> Fix
> ---
> Re-introduce the lo_open() call to restrict any process to open the loop
> device when its being detached
>
> Test case
> =========
> Test case involves the following two scripts:
>
> script1.sh
> ----------
> while [ 1 ];
> do
> losetup -P -f /home/opt/looptest/test10.img
> blkid /dev/loop0p1
> done
>
> script2.sh
> ----------
> while [ 1 ];
> do
> losetup -d /dev/loop0
> done
>
> Without fix, the following IO errors have been observed:
>
> kernel: __loop_clr_fd: partition scan of loop0 failed (rc=-16)
> kernel: I/O error, dev loop0, sector 20971392 op 0x0:(READ) flags 0x80700
> phys_seg 1 prio class 0
> kernel: I/O error, dev loop0, sector 108868 op 0x0:(READ) flags 0x0
> phys_seg 1 prio class 0
> kernel: Buffer I/O error on dev loop0p1, logical block 27201, async page
> read
>
> V1->V2:
> Added a test case, 010, in blktests in tests/loop/
> Signed-off-by: Gulam Mohamed <gulam.mohamed@xxxxxxxxxx>
> ---
> drivers/block/loop.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/drivers/block/loop.c b/drivers/block/loop.c
> index 28a95fd366fe..9a235d8c062d 100644
> --- a/drivers/block/loop.c
> +++ b/drivers/block/loop.c
> @@ -1717,6 +1717,24 @@ static int lo_compat_ioctl(struct block_device *bdev, blk_mode_t mode,
> }
> #endif
>
> +static int lo_open(struct gendisk *disk, blk_mode_t mode)
> +{
> + struct loop_device *lo = disk->private_data;
> + int err;
> +
> + if (!lo)
> + return -ENXIO;
> +
> + err = mutex_lock_killable(&lo->lo_mutex);
> + if (err)
> + return err;
> +
> + if (lo->lo_state == Lo_rundown)
> + err = -ENXIO;
> + mutex_unlock(&lo->lo_mutex);
> + return err;
> +}

Most of this function uses spaces rather than tabs.

--
Jens Axboe