Re: [PATCH 2/7] md/raid1: handle atomic writes that require splitting

From: John Garry

Date: Tue Jun 23 2026 - 04:13:30 EST

On 23/06/2026 08:24, Abd-Alrhman Masalkhi wrote:

If a request already requires splitting when entering
raid1_write_request(), the current code allows it to proceed until it
eventually reaches the split path.

The block layer should catch invalid atomic writes in submit_bio_noacct() -> blk_validate_atomic_write_op_size() before we even get as far as the md atomic write handling. The check in bio_submit_split_bioset() is really just a fail-safe in case the block layer fails to catch an invalid atomic write or the atomic write queue limits are not calculated properly.
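
For illustration, the size check referred to above can be modelled in user space. This is only a sketch under assumptions: the struct, enum, and function names below are hypothetical, not the kernel's, and the rule shown (reject a size above the atomic write unit max, or one that is not a multiple of the unit min) is my reading of the check rather than a copy of blk_validate_atomic_write_op_size().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a queue's atomic write limits, in bytes. */
struct model_atomic_limits {
	uint32_t unit_min_bytes;
	uint32_t unit_max_bytes;
};

/* Hypothetical status codes; the kernel uses blk_status_t values. */
enum model_status { MODEL_STS_OK, MODEL_STS_INVAL };

/*
 * Assumed rule: an atomic write is valid only if its size does not
 * exceed the unit max and is a whole multiple of the unit min.
 */
static enum model_status
model_validate_atomic_write_size(const struct model_atomic_limits *lim,
				 uint32_t bio_bytes)
{
	/* Limits of zero mean atomic writes are not supported here. */
	if (lim->unit_min_bytes == 0 || lim->unit_max_bytes == 0)
		return MODEL_STS_INVAL;

	if (bio_bytes > lim->unit_max_bytes)
		return MODEL_STS_INVAL;

	if (bio_bytes % lim->unit_min_bytes)
		return MODEL_STS_INVAL;

	return MODEL_STS_OK;
}

/* Small usage example: call this from main() to exercise the model. */
void model_atomic_size_demo(void)
{
	struct model_atomic_limits lim = { 4096, 65536 };

	/* Within limits and aligned to the unit min. */
	assert(model_validate_atomic_write_size(&lim, 8192) == MODEL_STS_OK);
	/* Larger than the unit max. */
	assert(model_validate_atomic_write_size(&lim, 131072) == MODEL_STS_INVAL);
	/* Not a multiple of the unit min. */
	assert(model_validate_atomic_write_size(&lim, 6144) == MODEL_STS_INVAL);
}
```

If the limits are calculated correctly and a check like this runs at submission time, an oversized atomic write should never reach the md split path, which is why the later check is only a fail-safe.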

Along the way, the bio may instead
fail due to other conditions and return a different status, even though
the request was invalid as an atomic write from the beginning.

Additionally, an otherwise valid atomic write may later require
splitting because bad blocks reduce the writable range or because
write-behind constraints reduce the maximum writable size. In these
cases, the bio currently completes with either EINVAL or ENOTSUPP,
whereas it should complete with EIO instead.
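
The two cases described above can be sketched as a small user-space model. It is an illustration under assumptions, not the driver code: all names are hypothetical, good_sectors stands for the writable range left after bad blocks or write-behind limits (pass the full bio size when nothing reduces it), and the outcomes follow the behaviour the patch description proposes.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcomes of the modelled write path. */
enum model_result {
	MODEL_SUBMIT,     /* written as a single bio */
	MODEL_SPLIT,      /* non-atomic write, split and continued */
	MODEL_FAIL_INVAL, /* atomic write that needed splitting on entry */
	MODEL_FAIL_IO     /* atomic write that needed splitting later */
};

static enum model_result
model_raid1_write(bool atomic, int bio_sectors, int max_sectors_at_entry,
		  int good_sectors)
{
	int max_sectors = max_sectors_at_entry;

	/* Invalid as an atomic write from the beginning. */
	if (atomic && max_sectors != bio_sectors)
		return MODEL_FAIL_INVAL;

	/* Bad blocks or write-behind limits shrink the writable range. */
	if (good_sectors < max_sectors)
		max_sectors = good_sectors;

	if (max_sectors < bio_sectors)
		return atomic ? MODEL_FAIL_IO : MODEL_SPLIT;

	return MODEL_SUBMIT;
}

/* Small usage example: call this from main() to exercise the model. */
void model_raid1_write_demo(void)
{
	/* Atomic, fits, nothing reduces the range. */
	assert(model_raid1_write(true, 8, 8, 8) == MODEL_SUBMIT);
	/* Atomic, already too large on entry. */
	assert(model_raid1_write(true, 8, 4, 8) == MODEL_FAIL_INVAL);
	/* Atomic, valid on entry, but bad blocks leave only 3 sectors. */
	assert(model_raid1_write(true, 8, 8, 3) == MODEL_FAIL_IO);
	/* Same situation for a regular write: it is simply split. */
	assert(model_raid1_write(false, 8, 8, 3) == MODEL_SPLIT);
}
```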

Fixes: f2a38abf5f1c ("md/raid1: Atomic write support")
Fixes: a4c55c902670 ("md/raid1: simplify raid1_write_request() error handling")
Signed-off-by: Abd-Alrhman Masalkhi <abd.masalkhi@xxxxxxxxx>
---
drivers/md/raid1.c | 25 +++++++++++--------------
1 file changed, 11 insertions(+), 14 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 86d4f224ffb1..8386d37343a4 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1511,9 +1511,15 @@ static bool raid1_write_request(struct mddev *mddev, struct bio *bio,
int first_clone;
bool write_behind = false;
bool nowait = bio->bi_opf & REQ_NOWAIT;
+ bool atomic = bio->bi_opf & REQ_ATOMIC;
bool is_discard = op_is_discard(bio->bi_opf);
sector_t sector = bio->bi_iter.bi_sector;
+ if (atomic && max_sectors != bio_sectors(bio)) {
+ bio_endio_status(bio, BLK_STS_INVAL);
+ return false;
+ }
+
if (mddev_is_clustered(mddev) &&
mddev->cluster_ops->area_resyncing(mddev, WRITE, sector,
bio_end_sector(bio))) {
@@ -1592,20 +1598,6 @@ static bool raid1_write_request(struct mddev *mddev, struct bio *bio,
}
if (is_bad) {
int good_sectors;
-
- /*
- * We cannot atomically write this, so just
- * error in that case. It could be possible to
- * atomically write other mirrors, but the
- * complexity of supporting that is not worth
- * the benefit.
- */
- if (bio->bi_opf & REQ_ATOMIC) {
- bio->bi_status = BLK_STS_NOTSUPP;

What baseline are you using here? This looks different from both linux-next (22 June) and Linus' master branch.

- bio_endio(bio);
- goto err_dec_pending;
- }
-
good_sectors = first_bad - sector;
if (good_sectors < max_sectors)
max_sectors = good_sectors;
@@ -1626,6 +1618,11 @@ static bool raid1_write_request(struct mddev *mddev, struct bio *bio,
max_sectors = min_t(int, max_sectors,
BIO_MAX_VECS * (PAGE_SIZE >> 9));
if (max_sectors < bio_sectors(bio)) {
+ if (atomic) {
+ bio_io_error(bio);
+ goto err_dec_pending;
+ }
+
bio = bio_submit_split_bioset(bio, max_sectors,
&conf->bio_split);
if (!bio)