Re: [PATCH -next] block: fix bio lost for plug enabled bio based device

From: Changhui Zhong
Date: Tue May 21 2024 - 20:38:48 EST


On Tue, May 21, 2024 at 8:10 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
>
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> With the following two conditions, bio will be lost:
>
> 1) blk plug is not enabled, for example, __blkdev_direct_IO_simple() and
> __blkdev_direct_IO_async();
> 2) bio plug is enabled, for example write IO for raid1/raid10 while
> bitmap is enabled;
>
> Root cause is that blk_finish_plug() will add the bio to
> current->bio_list, while such bio will not be handled:
>
> __submit_bio_noacct
>  current->bio_list = bio_list_on_stack;
>  blk_start_plug
>
>  do {
>    dm_submit_bio
>     md_handle_request
>      raid10_write_request
>       -> generate new bio for underlying disks
>        raid1_add_bio_to_plug -> bio is added to plug
>  } while ((bio = bio_list_pop(&bio_list_on_stack[0])))
>   -> previous bio are all handled
>
>  blk_finish_plug
>   raid10_unplug
>    raid1_submit_write
>     submit_bio_noacct
>      if (current->bio_list)
>       bio_list_add(&current->bio_list[0], bio)
>       -> add new bio
>
>  current->bio_list = NULL
>  -> new bio is lost
>
> Fix the problem by moving plug into the while loop, so that
> current->bio_list will still be handled after blk_finish_plug().
>
> By the way, enabling the plug for raid1/raid10 in this case also avoids
> deferring IO handling to the daemon thread, which should improve IO
> performance.
>
> Fixes: 060406c61c7c ("block: add plug while submitting IO")
> Reported-by: Changhui Zhong <czhong@xxxxxxxxxx>
> Closes: https://lore.kernel.org/all/CAGVVp+Xsmzy2G9YuEatfMT6qv1M--YdOCQ0g7z7OVmcTbBxQAg@xxxxxxxxxxxxxx/
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> ---
> block/blk-core.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 01186333c88e..dd29d5465af6 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -613,9 +613,14 @@ static inline blk_status_t blk_check_zone_append(struct request_queue *q,
>
>  static void __submit_bio(struct bio *bio)
>  {
> +	/* If plug is not used, add new plug here to cache nsecs time. */
> +	struct blk_plug plug;
> +
>  	if (unlikely(!blk_crypto_bio_prep(&bio)))
>  		return;
>
> +	blk_start_plug(&plug);
> +
>  	if (!bio->bi_bdev->bd_has_submit_bio) {
>  		blk_mq_submit_bio(bio);
>  	} else if (likely(bio_queue_enter(bio) == 0)) {
> @@ -624,6 +629,8 @@ static void __submit_bio(struct bio *bio)
>  		disk->fops->submit_bio(bio);
>  		blk_queue_exit(disk->queue);
>  	}
> +
> +	blk_finish_plug(&plug);
>  }
>
> /*
> @@ -648,13 +655,11 @@ static void __submit_bio(struct bio *bio)
>  static void __submit_bio_noacct(struct bio *bio)
>  {
>  	struct bio_list bio_list_on_stack[2];
> -	struct blk_plug plug;
>
>  	BUG_ON(bio->bi_next);
>
>  	bio_list_init(&bio_list_on_stack[0]);
>  	current->bio_list = bio_list_on_stack;
> -	blk_start_plug(&plug);
>
>  	do {
>  		struct request_queue *q = bdev_get_queue(bio->bi_bdev);
> @@ -688,23 +693,19 @@ static void __submit_bio_noacct(struct bio *bio)
>  		bio_list_merge(&bio_list_on_stack[0], &bio_list_on_stack[1]);
>  	} while ((bio = bio_list_pop(&bio_list_on_stack[0])));
>
> -	blk_finish_plug(&plug);
>  	current->bio_list = NULL;
>  }
>
>  static void __submit_bio_noacct_mq(struct bio *bio)
>  {
>  	struct bio_list bio_list[2] = { };
> -	struct blk_plug plug;
>
>  	current->bio_list = bio_list;
> -	blk_start_plug(&plug);
>
>  	do {
>  		__submit_bio(bio);
>  	} while ((bio = bio_list_pop(&bio_list[0])));
>
> -	blk_finish_plug(&plug);
>  	current->bio_list = NULL;
>  }
>
> --
> 2.39.2
>

Hi Kuai,

This patch fixes the raid1 and raid10 issue.
Please feel free to add:

Tested-by: Changhui Zhong <czhong@xxxxxxxxxx>
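
For anyone following the thread, the ordering problem described in the
commit message can be sketched in user space. This is only a toy model
under stated assumptions, not kernel code: the struct layouts and the
names toy_add(), toy_pop(), finish_plug(), submit_one() and run_model()
are hypothetical stand-ins for current->bio_list, the raid plug callback
and the submit loop.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct bio {
	struct bio *child;	/* bio a raid-like driver generates for the disk */
	struct bio *next;
};

struct bio_list {
	struct bio *head, *tail;
};

static struct bio_list *cur_bio_list;	/* models current->bio_list */
static struct bio *plugged;		/* models the bio parked in the plug */
static int handled;			/* leaf bios that were actually handled */

static void toy_add(struct bio_list *l, struct bio *b)
{
	assert(b->next == NULL);	/* mirrors BUG_ON(bio->bi_next) */
	if (l->tail)
		l->tail->next = b;
	else
		l->head = b;
	l->tail = b;
}

static struct bio *toy_pop(struct bio_list *l)
{
	struct bio *b = l->head;

	if (b) {
		l->head = b->next;
		if (!l->head)
			l->tail = NULL;
		b->next = NULL;
	}
	return b;
}

/* Models blk_finish_plug() -> raid10_unplug() -> submit_bio_noacct(). */
static void finish_plug(void)
{
	struct bio *b = plugged;

	plugged = NULL;
	assert(cur_bio_list != NULL);	/* the model only covers the nested case */
	if (b)
		toy_add(cur_bio_list, b);	/* queued for the submit loop */
}

/* Models __submit_bio(): a raid bio parks its child, a leaf bio completes. */
static void submit_one(struct bio *b)
{
	if (b->child)
		plugged = b->child;
	else
		handled++;
}

/* Returns how many leaf bios were handled for one top-level submission. */
static int run_model(int plug_inside_loop)
{
	struct bio leaf = { NULL, NULL };
	struct bio top = { &leaf, NULL };
	struct bio_list on_stack = { NULL, NULL };
	struct bio *b = &top;

	handled = 0;
	plugged = NULL;
	cur_bio_list = &on_stack;
	do {
		submit_one(b);
		if (plug_inside_loop)
			finish_plug();	/* new order: unplug before the next pop */
	} while ((b = toy_pop(&on_stack)));
	if (!plug_inside_loop)
		finish_plug();		/* old order: nobody pops the list again */
	cur_bio_list = NULL;
	return handled;
}
```

In this model run_model(0), with the plug finished after the loop,
returns 0 because the leaf bio is queued after the last pop and then
dropped. run_model(1), with the plug finished inside the loop, returns 1.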

Thanks,