Re: [PATCH 4/4] ublk: improve handling of saturated queues when ublk server exits

From: Ming Lei
Date: Wed Mar 26 2025 - 22:06:43 EST


On Tue, Mar 25, 2025 at 04:19:34PM -0600, Uday Shankar wrote:
> There are currently two ways in which ublk server exit is detected by
> ublk_drv:
>
> 1. uring_cmd cancellation. If there are any outstanding uring_cmds which
> have not been completed to the ublk server when it exits, io_uring
> calls the uring_cmd callback with a special cancellation flag as the
> issuing task is exiting.
> 2. I/O timeout. This is needed in addition to the above to handle the
> "saturated queue" case, when all I/Os for a given queue are in the
> ublk server, and therefore there are no outstanding uring_cmds to
> cancel when the ublk server exits.
>
> The second method detects ublk server exit only after a long delay
> (~30s, the default timeout assigned by the block layer). Any
> applications using the ublk device will be left hanging for these 30s
> before seeing an error/knowing anything went wrong. This problem is
> illustrated by running the new test_generic_02 against a ublk_drv which
> doesn't have the fix:
>
> selftests: ublk: test_generic_02.sh
> dev id is 0
> dd: error writing '/dev/ublkb0': Input/output error
> 1+0 records in
> 0+0 records out
> 0 bytes copied, 30.0611 s, 0.0 kB/s
> DEAD
> dd took 31 seconds to exit (>= 5s tolerance)!
> generic_02 : [FAIL]
>
> Fix this by instead handling the saturated queue case in the ublk
> character file release callback. This happens during ublk server exit
> and handles the issue much more quickly than an I/O timeout:
>
> selftests: ublk: test_generic_02.sh
> dev id is 0
> dd: error writing '/dev/ublkb0': Input/output error
> 1+0 records in
> 0+0 records out
> 0 bytes copied, 0.0376731 s, 0.0 kB/s
> DEAD
> generic_02 : [PASS]
>
> Signed-off-by: Uday Shankar <ushankar@xxxxxxxxxxxxxxx>
> ---
> drivers/block/ublk_drv.c | 40 +++++++++++------------
> tools/testing/selftests/ublk/Makefile | 1 +
> tools/testing/selftests/ublk/kublk.c | 3 ++
> tools/testing/selftests/ublk/kublk.h | 3 ++
> tools/testing/selftests/ublk/null.c | 4 +++
> tools/testing/selftests/ublk/test_generic_02.sh | 43 +++++++++++++++++++++++++
> 6 files changed, 72 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
> index c060da409ed8a888b7e414c9065efd2cbd6d57d7..1816b2cac01056dc9d01455759594af43c5f78d6 100644
> --- a/drivers/block/ublk_drv.c
> +++ b/drivers/block/ublk_drv.c
> @@ -1247,8 +1247,6 @@ static void ublk_queue_cmd(struct ublk_queue *ubq, struct request *rq)
> static enum blk_eh_timer_return ublk_timeout(struct request *rq)
> {
> struct ublk_queue *ubq = rq->mq_hctx->driver_data;
> - unsigned int nr_inflight = 0;
> - int i;
>
> if (ubq->flags & UBLK_F_UNPRIVILEGED_DEV) {
> if (!ubq->timeout) {
> @@ -1259,26 +1257,6 @@ static enum blk_eh_timer_return ublk_timeout(struct request *rq)
> return BLK_EH_DONE;
> }
>
> - if (!ubq_daemon_is_dying(ubq))
> - return BLK_EH_RESET_TIMER;
> -
> - for (i = 0; i < ubq->q_depth; i++) {
> - struct ublk_io *io = &ubq->ios[i];
> -
> - if (!(io->flags & UBLK_IO_FLAG_ACTIVE))
> - nr_inflight++;
> - }
> -
> - /* cancelable uring_cmd can't help us if all commands are in-flight */
> - if (nr_inflight == ubq->q_depth) {
> - struct ublk_device *ub = ubq->dev;
> -
> - if (ublk_abort_requests(ub, ubq)) {
> - schedule_work(&ub->nosrv_work);
> - }
> - return BLK_EH_DONE;
> - }
> -
> return BLK_EH_RESET_TIMER;
> }
>
> @@ -1351,6 +1329,24 @@ static int ublk_ch_open(struct inode *inode, struct file *filp)
> static int ublk_ch_release(struct inode *inode, struct file *filp)
> {
> struct ublk_device *ub = filp->private_data;
> + bool need_schedule = false;
> + int i;
> +
> + /*
> + * Error out any requests outstanding to the ublk server. This
> + * may have happened already (via uring_cmd cancellation), in
> + * which case it is not harmful to repeat. But uring_cmd
> + * cancellation does not handle queues which are fully saturated
> + * (all requests in ublk server), because from the kernel's POV,
> + * there are no outstanding uring_cmds to cancel. This code
> + * handles such queues.
> + */
> +
> + for (i = 0; i < ub->dev_info.nr_hw_queues; i++)
> + need_schedule |= ublk_abort_requests(ub, ublk_get_queue(ub, i));
> +
> + if (need_schedule)
> + schedule_work(&ub->nosrv_work);

Thinking about it further, you needn't call ublk_abort_requests() and schedule
the stop work to remove the disk here.

It could be done the following way:

1) do nothing if 'ubq->canceling' is set; it is safe to check this flag here
because all uring_cmds have completed by now

2) otherwise, only abort the in-flight requests, by calling
ublk_abort_requests() and skipping the get & set of ->canceling.

That should avoid the 30-second delay, right?
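
Just to make the idea concrete, an untested sketch (where __ublk_abort_requests()
stands for a hypothetical variant of ublk_abort_requests() that only errors out
the in-flight requests and does not get or set ->canceling, and the existing
UB_STATE_OPEN handling is kept as-is):

static int ublk_ch_release(struct inode *inode, struct file *filp)
{
	struct ublk_device *ub = filp->private_data;
	int i;

	for (i = 0; i < ub->dev_info.nr_hw_queues; i++) {
		struct ublk_queue *ubq = ublk_get_queue(ub, i);

		/*
		 * All uring_cmds have completed before ->release runs, so
		 * ->canceling can be read safely; if it is set, uring_cmd
		 * cancellation already took care of this queue.
		 */
		if (ubq->canceling)
			continue;

		/*
		 * Saturated queue: there was no uring_cmd left to cancel,
		 * so error out the in-flight requests here without touching
		 * ->canceling.
		 */
		__ublk_abort_requests(ub, ubq);
	}

	clear_bit(UB_STATE_OPEN, &ub->state);
	return 0;
}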

Thanks,
Ming