Re: block: be more careful about status in __bio_chain_endio

From: Mike Snitzer
Date: Fri Feb 22 2019 - 18:55:07 EST


On Fri, Feb 22 2019 at 5:46pm -0500,
Jens Axboe <axboe@xxxxxxxxx> wrote:

> On 2/22/19 2:10 PM, Mike Snitzer wrote:
> > On Thu, Feb 15 2018 at 4:09am -0500,
> > NeilBrown <neilb@xxxxxxxx> wrote:
> >
> >>
> >> If two bios are chained under the one parent (with bio_chain())
> >> it is possible that one will succeed and the other will fail.
> >> __bio_chain_endio must ensure that the failure error status
> >> is reported for the whole, rather than the success.
> >>
> >> It currently tries to be careful, but this test is racy.
> >> If both children finish at the same time, they might both see that
> >> parent->bi_status as zero, and so will assign their own status.
> >> If the assignment to parent->bi_status by the successful bio happens
> >> last, the error status will be lost which can lead to silent data
> >> corruption.
> >>
> >> Instead, __bio_chain_endio should only assign a non-zero status
> >> to parent->bi_status. There is then no need to test the current
> >> value of parent->bi_status - a test that would be racy anyway.
> >>
> >> Note that this bug hasn't been seen in practice. It was only discovered
> >> by examination after a similar bug was found in dm.c
> >>
> >> Signed-off-by: NeilBrown <neilb@xxxxxxxx>
> >> ---
> >> block/bio.c | 2 +-
> >> 1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/block/bio.c b/block/bio.c
> >> index e1708db48258..ad77140edc6f 100644
> >> --- a/block/bio.c
> >> +++ b/block/bio.c
> >> @@ -312,7 +312,7 @@ static struct bio *__bio_chain_endio(struct bio *bio)
> >> {
> >> struct bio *parent = bio->bi_private;
> >>
> >> - if (!parent->bi_status)
> >> + if (bio->bi_status)
> >> parent->bi_status = bio->bi_status;
> >> bio_put(bio);
> >> return parent;
> >> --
> >> 2.14.0.rc0.dirty
> >>
> >
> > Reviewed-by: Mike Snitzer <snitzer@xxxxxxxxxx>
> >
> > Jens, this one slipped through the crack just over a year ago.
> > It is available in patchwork here:
> > https://patchwork.kernel.org/patch/10220727/
>
> Should this be:
>
> if (!parent->bi_status && bio->bi_status)
> parent->bi_status = bio->bi_status;
>
> perhaps?

Yeap, even better. Not seeing any reason to have the last error win,
the first in the chain is likely the most important.