[PATCH 0/4] md: fix is_mddev_idle()

From: Yu Kuai
Date: Sat Apr 12 2025 - 03:39:17 EST


From: Yu Kuai <yukuai3@xxxxxxxxxx>

If sync_speed is above speed_min, then is_mddev_idle() will be called
for each sync IO to check if the array is idle, and inflight sync IO
will be limited if the array is not idle.
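
For context, the throttling path in md_do_sync() works roughly as below
(a simplified sketch, not the exact kernel code):

	if (currspeed > speed_min(mddev) && !is_mddev_idle(mddev, 0)) {
		/*
		 * Give normal IO more of a chance: wait until the
		 * inflight sync IO that was already issued completes.
		 */
		wait_event(mddev->recovery_wait,
			   !atomic_read(&mddev->recovery_active));
	}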

However, when running mkfs.ext4 on a large raid5 array while recovery is
in progress, it's found that sync_speed is already above speed_min while
lots of stripes are used for sync IO, causing long delays for mkfs.ext4.

The root cause is the following check in is_mddev_idle():

t1: submit sync IO: events1 = completed IO - issued sync IO
t2: submit next sync IO: events2 = completed IO - issued sync IO
if (events2 - events1 > 64)

As a consequence, the more sync IO is issued, the less likely the check
is to pass. Only when completed normal IO exceeds issued sync IO does the
condition finally pass and is_mddev_idle() return false; however,
last_events is updated at that point, hence is_mddev_idle() can only
return false once in a while.
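
For reference, the current per-rdev check looks roughly like the snippet
below (a simplified sketch of is_mddev_idle(), details omitted):

	/*
	 * curr_events counts IO completed on the member disk minus the
	 * sync IO issued to it, so heavy sync IO keeps the delta small
	 * and the check rarely fires.
	 */
	curr_events = (int)part_stat_read_accum(disk->part0, sectors) -
		      atomic_read(&disk->sync_io);
	if (init || curr_events - rdev->last_events > 64) {
		rdev->last_events = curr_events;
		idle = 0;	/* array is considered busy */
	}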

Fix this problem by changing the check as follows (a rough sketch of the
new check follows the list); the array is considered idle only if:

1) mddev doesn't have normal IO completed;
2) mddev doesn't have normal IO inflight;
3) if any member disk is a partition, all other partitions (on the same
disk) don't have IO completed.
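
Below is an illustrative sketch of the three conditions above; names like
mddev_is_idle() and last_events here are made up for illustration and are
not the actual patch:

	static bool mddev_is_idle(struct mddev *mddev, unsigned long last_events)
	{
		struct block_device *part0 = mddev->gendisk->part0;

		/* 1) no normal IO completed since the last check */
		if (part_stat_read_accum(part0, sectors) != last_events)
			return false;

		/* 2) no normal IO currently inflight */
		if (part_in_flight(part0))
			return false;

		/*
		 * 3) if any member disk is a partition, the other
		 *    partitions on the same disk must have no completed
		 *    IO either (omitted here for brevity).
		 */
		return true;
	}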

Yu Kuai (4):
block: export part_in_flight()
md: add a new api sync_io_depth
md: fix is_mddev_idle()
md: cleanup accounting for issued sync IO

 block/blk.h               |   1 -
 block/genhd.c             |   1 +
 drivers/md/md.c           | 181 ++++++++++++++++++++++++++------------
 drivers/md/md.h           |  15 +---
 drivers/md/raid1.c        |   3 -
 drivers/md/raid10.c       |   9 --
 drivers/md/raid5.c        |   8 --
 include/linux/blkdev.h    |   1 -
 include/linux/part_stat.h |   1 +
 9 files changed, 130 insertions(+), 90 deletions(-)

--
2.39.2