Re: [PATCH v5 3/5] io_uring: count CQEs in io_iopoll_check()

From: Ming Lei

Date: Wed Mar 04 2026 - 05:33:11 EST


On Mon, Mar 02, 2026 at 10:29:12AM -0700, Caleb Sander Mateos wrote:
> A subsequent commit will allow uring_cmds that don't use iopoll on
> IORING_SETUP_IOPOLL io_urings. As a result, CQEs can be posted without
> setting the iopoll_completed flag for a request in iopoll_list or going
> through task work. For example, a UBLK_U_IO_FETCH_IO_CMDS command could
> call io_uring_mshot_cmd_post_cqe() to directly post a CQE. The
> io_iopoll_check() loop currently only counts completions posted in
> io_do_iopoll() when determining whether the min_events threshold has
> been met. It also exits early if there are any existing CQEs before
> polling, or if any CQEs are posted while running task work. CQEs posted
> via io_uring_mshot_cmd_post_cqe() or other mechanisms won't be counted
> against min_events.
>
> Explicitly check the available CQEs in each io_iopoll_check() loop
> iteration to account for CQEs posted in any fashion.
>
> Signed-off-by: Caleb Sander Mateos <csander@xxxxxxxxxxxxxxx>
> ---
> io_uring/io_uring.c | 9 ++-------
> 1 file changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 46f39831d27c..b4625695bb3a 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -1184,11 +1184,10 @@ __cold void io_iopoll_try_reap_events(struct io_ring_ctx *ctx)
> io_move_task_work_from_local(ctx);
> }
>
> static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned int min_events)
> {
> - unsigned int nr_events = 0;
> unsigned long check_cq;
>
> min_events = min(min_events, ctx->cq_entries);
>
> lockdep_assert_held(&ctx->uring_lock);
> @@ -1227,34 +1226,30 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned int min_events)
> * the poll to the issued list. Otherwise we can spin here
> * forever, while the workqueue is stuck trying to acquire the
> * very same mutex.
> */
> if (list_empty(&ctx->iopoll_list) || io_task_work_pending(ctx)) {
> - u32 tail = ctx->cached_cq_tail;
> -
> (void) io_run_local_work_locked(ctx, min_events);
>
> if (task_work_pending(current) || list_empty(&ctx->iopoll_list)) {
> mutex_unlock(&ctx->uring_lock);
> io_run_task_work();
> mutex_lock(&ctx->uring_lock);
> }
> /* some requests don't go through iopoll_list */
> - if (tail != ctx->cached_cq_tail || list_empty(&ctx->iopoll_list))
> + if (list_empty(&ctx->iopoll_list))
> break;
> }
> ret = io_do_iopoll(ctx, !min_events);
> if (unlikely(ret < 0))
> return ret;
>
> if (task_sigpending(current))
> return -EINTR;
> if (need_resched())
> break;
> -
> - nr_events += ret;
> - } while (nr_events < min_events);
> + } while (io_cqring_events(ctx) < min_events);

Before entering the loop, if io_cqring_events() finds any queued CQE,
io_iopoll_check() returns immediately without polling.

If the queued CQE originated from a non-iopoll uring_cmd, the iopoll
requests will not be polled. Could this be an issue?
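
To make the concern concrete, here is a minimal user-space sketch of the
control flow. It is a toy model, not kernel code: struct toy_ctx and
toy_iopoll_check() are hypothetical names, and the poll step is assumed to
complete exactly one pending request per iteration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of the ring state; not the kernel's io_ring_ctx. */
struct toy_ctx {
	unsigned int cq_events;      /* CQEs already queued in the CQ ring */
	unsigned int iopoll_pending; /* requests still on the iopoll list */
	unsigned int polled;         /* number of poll steps performed */
};

/*
 * Models only the two pieces discussed here: the check before the loop
 * and the new loop condition. Task work, signals and need_resched() are
 * deliberately left out.
 */
static int toy_iopoll_check(struct toy_ctx *ctx, unsigned int min_events)
{
	assert(ctx != NULL);

	/* Check before the loop: any queued CQE returns without polling. */
	if (ctx->cq_events)
		return 0;

	do {
		if (!ctx->iopoll_pending)
			break;
		/* Assumed poll step: one iopoll request completes. */
		ctx->iopoll_pending--;
		ctx->cq_events++;
		ctx->polled++;
	} while (ctx->cq_events < min_events);

	return 0;
}
```

In this model, calling toy_iopoll_check() with cq_events = 1 and
iopoll_pending = 2 returns with polled still 0, so the pending requests are
left untouched. With cq_events = 0 the loop runs and the requests are polled.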


Thanks,
Ming