[PATCH 3.16 272/410] md raid10: fix NULL deference in handle_write_completed()

From: Ben Hutchings
Date: Thu Jun 07 2018 - 10:29:35 EST


3.16.57-rc1 review patch. If anyone has any objections, please let me know.

------------------

From: Yufen Yu <yuyufen@xxxxxxxxxx>

commit 01a69cab01c184d3786af09e9339311123d63d22 upstream.

In the case of 'recover', an r10bio with R10BIO_WriteError &
R10BIO_IsRecover will be progressed by handle_write_completed().
This function traverses all r10bio->devs[copies].
If devs[m].repl_bio != NULL, it thinks conf->mirrors[dev].replacement
is also not NULL. However, this is not always true.

When there is an rdev of raid10 has replacement, then each r10bio
->devs[m].repl_bio != NULL in conf->r10buf_pool. However, in 'recover',
even if corresponded replacement is NULL, it doesn't clear r10bio
->devs[m].repl_bio, resulting in replacement NULL deference.

This bug was introduced when replacement support for raid10 was
added in Linux 3.3.

As NeilBrown suggested:
Elsewhere the determination of "is this device part of the
resync/recovery" is made by resting bio->bi_end_io.
If this is end_sync_write, then we tried to write here.
If it is NULL, then we didn't try to write.

Fixes: 9ad1aefc8ae8 ("md/raid10: Handle replacement devices during resync.")
Cc: stable (V3.3+)
Suggested-by: NeilBrown <neilb@xxxxxxxx>
Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx>
Signed-off-by: Shaohua Li <sh.li@xxxxxxxxxxxxxxx>
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
drivers/md/raid10.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -2746,7 +2746,8 @@ static void handle_write_completed(struc
for (m = 0; m < conf->copies; m++) {
int dev = r10_bio->devs[m].devnum;
rdev = conf->mirrors[dev].rdev;
- if (r10_bio->devs[m].bio == NULL)
+ if (r10_bio->devs[m].bio == NULL ||
+ r10_bio->devs[m].bio->bi_end_io == NULL)
continue;
if (test_bit(BIO_UPTODATE,
&r10_bio->devs[m].bio->bi_flags)) {
@@ -2762,7 +2763,8 @@ static void handle_write_completed(struc
md_error(conf->mddev, rdev);
}
rdev = conf->mirrors[dev].replacement;
- if (r10_bio->devs[m].repl_bio == NULL)
+ if (r10_bio->devs[m].repl_bio == NULL ||
+ r10_bio->devs[m].repl_bio->bi_end_io == NULL)
continue;
if (test_bit(BIO_UPTODATE,
&r10_bio->devs[m].repl_bio->bi_flags)) {