Re: 10x I/O await times in 6.12

From: Matt Fleming

Date: Thu Apr 23 2026 - 14:24:39 EST


Hi Yu and Jens,

On Mon, Apr 21, 2025 at 08:21:13PM +0800, Yu Kuai wrote:
> If nsecs - plug->cur_ktime is still milliseconds, can you check if
> the following patch can fix your problem?
>
> Thanks,
> Kuai
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index ae8494d88897..37197502147e 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1095,7 +1095,9 @@ static inline void blk_account_io_start(struct request *req)
>  		return;
>
>  	req->rq_flags |= RQF_IO_STAT;
> -	req->start_time_ns = blk_time_get_ns();
> +
> +	if (!current->plug)
> +		req->start_time_ns = blk_time_get_ns();
>
>  	/*
>  	 * All non-passthrough requests are created from a bio with one
> @@ -2874,6 +2876,7 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>  {
>  	struct request *rq;
>  	unsigned int depth;
> +	u64 now;
>
>  	/*
>  	 * We may have been called recursively midway through handling
> @@ -2887,6 +2890,10 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
>  	depth = plug->rq_count;
>  	plug->rq_count = 0;
>
> +	now = ktime_get_ns();
> +	rq_list_for_each(&plug->mq_list, rq)
> +		rq->start_time_ns = now;
> +
>  	if (!plug->multiple_queues && !plug->has_elevator && !from_schedule) {
>  		struct request_queue *q;
>
>

It's April again, and this time we're moving to 6.18.

The I/O accounting bug reported at the start of this thread is still an
issue for us, and we've been carrying the above patch for 12 months.

Digging into this a bit more, the trigger appears to be large writeback
batches (e.g. xfsaild invoking xfs_buf_delwri_submit_nowait()), where a
single plug covers hundreds of writes for around 10ms on average.

xfsaild can stay on-CPU for long stretches, so requests sit in the plug
long after blk_account_io_start() has stamped start_time_ns, and that
wait shows up as inflated I/O times in /proc/diskstats. Tracing the
requests shows that they're not actually any slower than on previous
kernel versions -- it's purely an accounting bug.
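
To put rough numbers on the effect, here is a toy userspace model, not
kernel code. The function names are hypothetical, and it assumes the
requests are plugged at evenly spaced times, the plug is flushed once at
the end of the hold window, and every request then completes after the
same fixed device time. It ignores device queueing entirely.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (assumption, not kernel code): n requests are added to one
 * plug at evenly spaced times across hold_ns, the plug is flushed at
 * hold_ns, and each request completes service_ns after the flush.
 */

/* Average accounted latency when the start time is stamped as each
 * request is allocated, i.e. while it is still sitting in the plug. */
static uint64_t avg_await_stamp_at_alloc(unsigned int n, uint64_t hold_ns,
					 uint64_t service_ns)
{
	uint64_t total = 0;

	assert(n > 0);
	for (unsigned int i = 0; i < n; i++) {
		uint64_t alloc_ns = hold_ns * i / n;     /* request i plugged */
		uint64_t done_ns = hold_ns + service_ns; /* flush + device time */

		total += done_ns - alloc_ns;
	}
	return total / n;
}

/* Average accounted latency when every request is re-stamped at flush
 * time: only the device time remains. */
static uint64_t avg_await_stamp_at_flush(unsigned int n, uint64_t hold_ns,
					 uint64_t service_ns)
{
	assert(n > 0);
	(void)hold_ns; /* time spent in the plug is no longer counted */
	return service_ns;
}
```

With made-up inputs of 200 requests, a 10ms plug and 1ms of device time,
avg_await_stamp_at_alloc(200, 10000000, 1000000) comes out at 6025000ns,
against 1000000ns for avg_await_stamp_at_flush() with the same inputs.
The gap is time spent in the plug, not time spent on the device.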

Of course, this doesn't look XFS-specific. Any task that issues massive
writes without getting preempted or interrupted would hit it.

Any interest in merging the above patch?

Thanks,
Matt