Re: [RFC PATCH] blk: reset 'bi_next' when bio is done inside request

From: NeilBrown
Date: Mon Apr 03 2017 - 17:26:00 EST


On Mon, Apr 03 2017, Michael Wang wrote:

> blk_attempt_plug_merge() try to merge bio into request and chain them
> by 'bi_next', while after the bio is done inside request, we forgot to
> reset the 'bi_next'.
>
> This lead into BUG while removing all the underlying devices from md-raid1,
> the bio once go through:
>
> md_do_sync()
> sync_request()
> generic_make_request()

This is a read request from the "first" device.

> blk_queue_bio()
> blk_attempt_plug_merge()
> CHAINED HERE
>
> will keep chained and reused by:
>
> raid1d()
> sync_request_write()
> generic_make_request()

This is a write request to some other device, isn't it?

If sync_request_write() is using a bio that has already been used, it
should call bio_reset() and fill in the details again.
However I don't see how that would happen.
Can you give specific details on the situation that triggers the bug?

Thanks,
NeilBrown


> BUG_ON(bio->bi_next)
>
> After reset the 'bi_next' this can no longer happen.
>
> Signed-off-by: Michael Wang <yun.wang@xxxxxxxxxxxxxxxx>
> ---
> block/blk-core.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 43b7d06..91223b2 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -2619,8 +2619,10 @@ bool blk_update_request(struct request *req, int error, unsigned int nr_bytes)
> struct bio *bio = req->bio;
> unsigned bio_bytes = min(bio->bi_iter.bi_size, nr_bytes);
>
> - if (bio_bytes == bio->bi_iter.bi_size)
> + if (bio_bytes == bio->bi_iter.bi_size) {
> req->bio = bio->bi_next;
> + bio->bi_next = NULL;
> + }
>
> req_bio_endio(req, bio, bio_bytes, error);
>
> --
> 2.5.0

Attachment: signature.asc
Description: PGP signature