We need a new branch in read_balance() to choose an rdev with a full copy.
Sure, I do realize that the mirroring personalities need more sophisticated error handling changes (than what I presented).
However, in raid1_read_request() we do the read_balance() and then the bio_split() attempt. So what are you suggesting we do for the bio_split() error? Is it to retry without the bio_split()?
To me, bio_split() should not fail. If it does, it is likely ENOMEM or some other bug being exposed, so I am not sure that retrying while skipping bio_split() is the right approach (if that is what you are suggesting).
bio_split_to_limits() is already called from md_submit_bio(), so here
the bio should only be split because of badblocks or resync. We have to
return an error for resync; for badblocks, however, we can still try to
find an rdev without badblocks so that bio_split() is not needed. In that
case we need to retry and tell read_balance() to skip rdevs with
badblocks.
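
To make the retry idea concrete, here is a rough user-space sketch, not
kernel code. Everything prefixed with toy_ is hypothetical and only models
the control flow: struct toy_rdev is not struct md_rdev, the array order
stands in for read_balance()'s real preference logic, and the split_fails
flag simulates bio_split() returning an error. The resync case, where we
would simply return the error, is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified mirror member (NOT the kernel's struct md_rdev). */
struct toy_rdev {
	long bad_start;	/* first bad sector, or -1 if there are no badblocks */
	long bad_len;	/* number of bad sectors starting at bad_start */
};

/* Sectors readable from 'sector' before hitting a bad block, capped at 'want'. */
static long toy_good_sectors(const struct toy_rdev *rdev, long sector, long want)
{
	if (rdev->bad_start < 0 || rdev->bad_len <= 0)
		return want;
	if (sector + want <= rdev->bad_start ||
	    sector >= rdev->bad_start + rdev->bad_len)
		return want;		/* range does not touch the bad blocks */
	if (sector >= rdev->bad_start)
		return 0;		/* range starts inside the bad blocks */
	return rdev->bad_start - sector;
}

/*
 * Toy stand-in for read_balance(): pick the first usable device (array
 * order is the assumed preference order). With full_copy_only set, skip
 * any device that cannot serve the whole range without a split.
 * Returns the device index or -1; *max_sectors is the readable length.
 */
static int toy_read_balance(const struct toy_rdev *rdevs, size_t n,
			    long sector, long want, bool full_copy_only,
			    long *max_sectors)
{
	for (size_t i = 0; i < n; i++) {
		long len = toy_good_sectors(&rdevs[i], sector, want);

		if (len == 0)
			continue;
		if (full_copy_only && len < want)
			continue;
		*max_sectors = len;
		return (int)i;
	}
	return -1;
}

/*
 * Toy read path: balance first, then "split" if the chosen device cannot
 * serve the whole range. If the split fails, retry the balance step and
 * ask for a device that holds a full copy, so no split is needed.
 */
static int toy_read_request(const struct toy_rdev *rdevs, size_t n,
			    long sector, long want, bool split_fails)
{
	long max_sectors = 0;
	int disk;

	assert(want > 0);

	disk = toy_read_balance(rdevs, n, sector, want, false, &max_sectors);
	if (disk < 0)
		return -1;

	if (max_sectors < want && split_fails)
		disk = toy_read_balance(rdevs, n, sector, want, true,
					&max_sectors);

	return disk;
}
```

With a preferred device that has badblocks at sectors 100-107 and a second
clean device, a 16-sector read at sector 96 normally goes to the first
device (after a split), and falls back to the second device only when the
simulated split fails.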
This can only happen if the full copy only exists on slow disks. This
really is a corner case, and it is not related to your new error path for
atomic writes. I don't mind this version for now; it is just something
I noticed if bio_split() can fail.