Re: [PATCH v4 3/5] io_uring: count CQEs in io_iopoll_check()

From: Ming Lei

Date: Sat Feb 28 2026 - 04:46:22 EST


On Fri, Feb 27, 2026 at 03:35:01PM -0700, Caleb Sander Mateos wrote:
> A subsequent commit will allow uring_cmds that don't use iopoll on
> IORING_SETUP_IOPOLL io_urings. As a result, CQEs can be posted without
> setting the iopoll_completed flag for a request in iopoll_list or going
> through task work. For example, a UBLK_U_IO_FETCH_IO_CMDS command could
> call io_uring_mshot_cmd_post_cqe() to directly post a CQE. The
> io_iopoll_check() loop currently only counts completions posted in
> io_do_iopoll() when determining whether the min_events threshold has
> been met. It also exits early if there are any existing CQEs before
> polling, or if any CQEs are posted while running task work. CQEs posted
> via io_uring_mshot_cmd_post_cqe() or other mechanisms won't be counted
> against min_events.
>
> Explicitly check the available CQEs in each io_iopoll_check() loop
> iteration to account for CQEs posted in any fashion.
>
> Signed-off-by: Caleb Sander Mateos <csander@xxxxxxxxxxxxxxx>
> ---
> io_uring/io_uring.c | 18 +++---------------
> 1 file changed, 3 insertions(+), 15 deletions(-)
>
> diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
> index 46f39831d27c..5f694052f501 100644
> --- a/io_uring/io_uring.c
> +++ b/io_uring/io_uring.c
> @@ -1184,11 +1184,10 @@ __cold void io_iopoll_try_reap_events(struct io_ring_ctx *ctx)
> io_move_task_work_from_local(ctx);
> }
>
> static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned int min_events)
> {
> - unsigned int nr_events = 0;
> unsigned long check_cq;
>
> min_events = min(min_events, ctx->cq_entries);
>
> lockdep_assert_held(&ctx->uring_lock);
> @@ -1205,19 +1204,12 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, unsigned int min_events)
> * dropped CQE.
> */
> if (check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))
> return -EBADR;
> }
> - /*
> - * Don't enter poll loop if we already have events pending.
> - * If we do, we can potentially be spinning for commands that
> - * already triggered a CQE (eg in error).
> - */
> - if (io_cqring_events(ctx))
> - return 0;
>
> - do {
> + while (io_cqring_events(ctx) < min_events) {

This loop may not handle `min_events == 0` correctly; please see the AI review result:

https://netdev-ai.bots.linux.dev/ai-review.html?id=6977b6d6-04e4-4990-a96f-b7580fc5acc4
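
One possible reading of the concern, as a minimal user-space sketch rather
than kernel code: a do/while body runs at least once, while the new
`while (io_cqring_events(ctx) < min_events)` condition is never true for an
unsigned `min_events` of 0, so the poll body would be skipped entirely. All
names below (model_ctx, model_poll_once, model_old_loop, model_new_loop) are
hypothetical, and the `nr_events < min_events` tail condition of the old loop
is an assumption, since that line is not part of the hunk quoted above.

```c
/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_ctx {
	unsigned int cq_events;	/* stands in for io_cqring_events(ctx) */
	unsigned int polls;	/* number of times the loop body ran */
};

/* Models one pass of the poll body; pretends each pass posts one CQE. */
static void model_poll_once(struct model_ctx *c)
{
	c->polls++;
	c->cq_events++;
}

/* Old shape: early return on pending CQEs, then do { } while. */
static unsigned int model_old_loop(struct model_ctx *c, unsigned int min_events)
{
	unsigned int nr_events = 0;

	if (c->cq_events)
		return c->polls;

	do {
		model_poll_once(c);
		nr_events++;
	} while (nr_events < min_events);	/* assumed tail condition */

	return c->polls;
}

/* New shape from the patch: the condition is checked before the first pass. */
static unsigned int model_new_loop(struct model_ctx *c, unsigned int min_events)
{
	while (c->cq_events < min_events)
		model_poll_once(c);

	return c->polls;
}
```

In this model, with an empty CQ and min_events of 0, model_old_loop() runs
the body once and model_new_loop() never runs it. Whether skipping the poll
pass matters for the real io_iopoll_check() depends on code outside the
quoted hunk.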

Thanks,
Ming