Re: [REGRESSION][BISECTED] Spurious raid1 device failure triggered by qemu direct IO on 6.18+

From: Tomáš Trnka

Date: Thu Apr 16 2026 - 06:13:26 EST


The proposed patch fixes the issue for me and doesn't cause any visible problems
in normal operation (I didn't test an actual device failure). Thanks a lot for
such a speedy fix; I'm looking forward to testing the final version as well.

> ---
> diff --git a/drivers/md/raid1-10.c b/drivers/md/raid1-10.c
> index c33099925f230..cf1c25f290f36 100644
> --- a/drivers/md/raid1-10.c
> +++ b/drivers/md/raid1-10.c
> @@ -293,8 +293,16 @@ static inline bool raid1_should_read_first(struct mddev *mddev,
>   * bio with REQ_RAHEAD or REQ_NOWAIT can fail at anytime, before such IO is
>   * submitted to the underlying disks, hence don't record badblocks or retry
>   * in this case.
> + *
> + * BLK_STS_INVAL means the request itself is malformed (e.g. unaligned
> + * buffers that violate DMA constraints). Retrying on another mirror will
> + * fail the same way, and counting it against the device is wrong.
>   */
>  static inline bool raid1_should_handle_error(struct bio *bio)
>  {
> -	return !(bio->bi_opf & (REQ_RAHEAD | REQ_NOWAIT));
> +	if (bio->bi_opf & (REQ_RAHEAD | REQ_NOWAIT))
> +		return false;
> +	if (bio->bi_status == BLK_STS_INVAL)
> +		return false;
> +	return true;
>  }
> --