Re: [PATCH RFC] block: fix bio merge checks when virt_boundary is set

From: Ming Lei
Date: Wed Mar 30 2016 - 09:07:28 EST

On Fri, Mar 18, 2016 at 10:59 AM, Ming Lei <tom.leiming@xxxxxxxxx> wrote:
> On Fri, Mar 18, 2016 at 12:39 AM, Keith Busch <keith.busch@xxxxxxxxx> wrote:
>> On Thu, Mar 17, 2016 at 12:20:28PM +0100, Vitaly Kuznetsov wrote:
>>> Keith Busch <keith.busch@xxxxxxxxx> writes:
>>> > been combined. In any case, I think you can get what you're after just
>>> > by moving the gap check after BIOVEC_PHYS_MERGABLE. Does the following
>>> > look ok to you?
>>> >
>>> Thanks, it does.
>> Cool, thanks for confirming.
>>> Will you send it or would you like me to do that with your Suggested-by?
>> I'm not confident yet this doesn't break anything, particularly since
>> we moved the gap check after the length check. Just wanted to confirm
>> the concept addressed your concern, but still need to take a closer look
>> and test before submitting.
> IMO, the change on blk_bio_segment_split() is correct, because actually it
> is a sg gap and the check should have been done between segments
> instead of bvecs. So it is reasonable to move the check just before populating
> a new segment.

Thinking about the 1st part of the change further, it looks correct in concept
but wrong with the current implementation. Because of bio/request merging,
blk_rq_map_sg() may end a segment at any bvec in theory, so I guess
that is why each non-first bvec needs the check to make sure there is no
sg gap. It looks like a very crazy limit, :-)
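
To make the sg gap rule concrete, here is a rough userspace model of the
check applied between a bvec and its predecessor. It is only a sketch:
struct bvec_model and model_gap_to_prev() are made-up names, and the model
folds the "queue has no virt boundary" test into the same function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct bio_vec: only page offset and length. */
struct bvec_model {
	uint32_t offset;	/* byte offset inside the page */
	uint32_t len;		/* length in bytes */
};

/*
 * Model of the virt boundary rule: a gap exists if the next bvec does not
 * start at offset 0, or if the previous bvec does not end on the boundary.
 * virt_mask is the boundary mask, e.g. 0xfff for a 4K boundary.
 */
static bool model_gap_to_prev(uint32_t virt_mask,
			      const struct bvec_model *prev,
			      uint32_t next_offset)
{
	if (!virt_mask)
		return false;	/* queue has no virt boundary */
	return next_offset || ((prev->offset + prev->len) & virt_mask);
}
```

With a 4K boundary, two 512-byte bvecs always report a gap under this rule,
even when they sit back to back in memory, which is why small bios end up
unmergeable.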

> But for the 2nd change in bio_will_gap(), which should fix Vitaly's problem, I
> am still not sure it is completely correct. bio_will_gap() is used to check
> whether two bios may be merged. Even if two bios are physically contiguous, the
> last bvec of the 1st bio and the first bvec of the 2nd bio might not end up in
> the same segment because of the segment size limit.

How about the attached patch?

> The root cause might be from blkdev_writepage(), and I guess these small
> bios are from there.
> thanks,
> Ming Lei

Ming Lei

From 5f60ae1d686f025445fdf09f546d4d055d255ce9 Mon Sep 17 00:00:00 2001
From: Ming Lei <ming.lei@xxxxxxxxxxxxx>
Date: Fri, 18 Mar 2016 12:41:53 +0800
Subject: [PATCH] block: loose check on sg gap

If the last bvec of the 1st bio and the 1st bvec of the next
bio are physically contiguous, and the latter can be merged
into the last segment of the 1st bio, we should consider that
they don't violate the sg gap (or virt boundary) limit.

Vitaly reported that lots of unmergeable small bios are observed
when running mkfs.ntfs on Hyper-V virtual storage, and performance
becomes quite low, so this patch is written to fix the
performance issue.

Reported-by: Vitaly Kuznetsov <vkuznets@xxxxxxxxxx>
Cc: Keith Busch <keith.busch@xxxxxxxxx>
Signed-off-by: Ming Lei <ming.lei@xxxxxxxxxxxxx>
---
 include/linux/blkdev.h | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 7e5d7e0..3962527 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1394,6 +1394,25 @@ static inline bool bvec_gap_to_prev(struct request_queue *q,
 	return __bvec_gap_to_prev(q, bprv, offset);
 }
 
+/*
+ * Check if the two bvecs from two bios can be merged to one segment.
+ * If yes, no need to check gap between the two bios since the 1st bio
+ * and the 1st bvec in the 2nd bio can be handled in one segment.
+ */
+static inline bool bios_segs_mergeable(struct request_queue *q,
+		struct bio *prev, struct bio_vec *prev_last_bv,
+		struct bio_vec *next_first_bv)
+{
+	if (!BIOVEC_PHYS_MERGEABLE(prev_last_bv, next_first_bv))
+		return false;
+	if (!BIOVEC_SEG_BOUNDARY(q, prev_last_bv, next_first_bv))
+		return false;
+	if (prev->bi_seg_back_size + next_first_bv->bv_len >
+			queue_max_segment_size(q))
+		return false;
+	return true;
+}
+
 static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 			 struct bio *next)
 {
@@ -1403,7 +1422,8 @@ static inline bool bio_will_gap(struct request_queue *q, struct bio *prev,
 		bio_get_last_bvec(prev, &pb);
 		bio_get_first_bvec(next, &nb);
 
-		return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
+		if (!bios_segs_mergeable(q, prev, &pb, &nb))
+			return __bvec_gap_to_prev(q, &pb, nb.bv_offset);
 	}
 
 	return false;
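
For anyone who wants to play with the logic outside the kernel, the patched
decision can be modelled in userspace roughly as below. This is a sketch
under assumptions, not the kernel code: struct seg_bvec, struct seg_limits
and the model_*() functions are hypothetical names, bvecs are described by
physical address only, and the low bits of the physical address stand in
for bv_offset (which assumes the virt boundary is no larger than a page).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real code works on struct bio / struct bio_vec. */
struct seg_bvec {
	uint64_t phys;	/* physical start address */
	uint32_t len;	/* length in bytes */
};

struct seg_limits {
	uint64_t virt_mask;	/* virt boundary mask, 0 if unset */
	uint64_t seg_mask;	/* segment boundary mask */
	uint32_t max_seg_size;	/* maximum segment size in bytes */
};

/* Models the three conditions tested by bios_segs_mergeable(). */
static bool model_segs_mergeable(const struct seg_limits *lim,
				 uint32_t prev_back_size,
				 const struct seg_bvec *pb,
				 const struct seg_bvec *nb)
{
	/* physically contiguous: next starts where prev ends */
	if (pb->phys + pb->len != nb->phys)
		return false;
	/* both ends must lie in the same segment-boundary window */
	if ((pb->phys | lim->seg_mask) !=
	    ((nb->phys + nb->len - 1) | lim->seg_mask))
		return false;
	/* the grown segment must still respect the size limit */
	if (prev_back_size + nb->len > lim->max_seg_size)
		return false;
	return true;
}

/* Models the patched bio_will_gap(): only check the gap if not mergeable. */
static bool model_bio_will_gap(const struct seg_limits *lim,
			       uint32_t prev_back_size,
			       const struct seg_bvec *pb,
			       const struct seg_bvec *nb)
{
	if (!lim->virt_mask)
		return false;
	if (model_segs_mergeable(lim, prev_back_size, pb, nb))
		return false;
	return (nb->phys & lim->virt_mask) ||
	       ((pb->phys + pb->len) & lim->virt_mask);
}
```

In this model, two adjacent 512-byte bvecs no longer report a gap while the
previous segment still has room, but they do once prev_back_size has already
reached max_seg_size, which is the segment size case raised earlier in the
thread.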