[PATCH v3] block: fix trace completion for chained bio

From: edwardh
Date: Tue Jun 15 2021 - 23:08:52 EST


From: Edward Hsieh <edwardh@xxxxxxxxxxxx>

For chained bios, trace_block_bio_complete() in bio_endio() is currently
called only once, for the parent bio, after all chained bios have
completed. However, bio_split() modifies the sector and size of the
parent bio, so the sector and size reported in the complete events may
not match those of the queue events in blktrace.

The original fix for bio completion tracing, commit fbbaf700e7b1
("block: trace completion of all bios."), intended multiple complete
events to correspond to one queue event, but missed this case.

The issue can be reproduced with an md/raid5 read whose bio crosses a
chunk boundary.

To fix this, move the completion trace into the loop so that it is
emitted for every chained bio.

For the tail call optimization to make sense, the bio_chain_endio
handling should be at the end of bio_endio(), so move
blk_throtl_bio_endio() and bio_uninit() into the else branch of the
bio_chain_endio check. blk_throtl_bio_endio() updates the latency for
the throttle group; since the throttle group tracks the latency of the
current device, the latency should be recorded after all chained bios
have finished. And since the resources of a chained bio are released by
bio_put() in __bio_chain_endio(), calling bio_uninit() for a chained bio
is not necessary.

Fixes: fbbaf700e7b1 ("block: trace completion of all bios.")
Reviewed-by: Wade Liang <wadel@xxxxxxxxxxxx>
Reviewed-by: BingJing Chang <bingjingc@xxxxxxxxxxxx>
Signed-off-by: Edward Hsieh <edwardh@xxxxxxxxxxxx>
---
block/bio.c | 25 ++++++++++++-------------
1 file changed, 12 insertions(+), 13 deletions(-)
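
Not part of the commit: a minimal userspace sketch of the mismatch
described above, under simplifying assumptions. struct sim_bio,
sim_split(), sim_endio() and run_scenario() are hypothetical stand-ins
for struct bio, bio_split() + bio_chain(), and bio_endio(); they model
only the sector, size, remaining count and trace flag. A 16-sector bio
starting at sector 0 is queued and then split after 8 sectors.

```c
#include <assert.h>

/* Hypothetical stand-in for struct bio; only the fields needed here. */
struct sim_bio {
	unsigned long long sector;
	unsigned int nr_sectors;
	int remaining;			/* outstanding completions */
	int trace_completion;		/* models BIO_TRACE_COMPLETION */
	struct sim_bio *parent;		/* non-NULL for a chained bio */
};

struct sim_event {
	unsigned long long sector;
	unsigned int nr_sectors;
};

static struct sim_event events[8];
static int nr_events;

/* Record one "complete" event, as the tracepoint would. */
static void sim_trace_complete(const struct sim_bio *bio)
{
	assert(nr_events < 8);
	events[nr_events].sector = bio->sector;
	events[nr_events].nr_sectors = bio->nr_sectors;
	nr_events++;
}

/* Carve the first 'sectors' off 'parent' and chain 'split' to it. */
static void sim_split(struct sim_bio *parent, struct sim_bio *split,
		      unsigned int sectors)
{
	split->sector = parent->sector;
	split->nr_sectors = sectors;
	split->remaining = 1;
	split->trace_completion = parent->trace_completion;
	split->parent = parent;
	parent->sector += sectors;	/* parent now starts later ... */
	parent->nr_sectors -= sectors;	/* ... and is shorter */
	parent->remaining++;
}

/* trace_in_loop == 0 models the old placement, 1 the new one. */
static void sim_endio(struct sim_bio *bio, int trace_in_loop)
{
	for (;;) {
		if (--bio->remaining > 0)
			return;
		if (trace_in_loop && bio->trace_completion) {
			sim_trace_complete(bio);
			bio->trace_completion = 0;
		}
		if (!bio->parent)
			break;
		bio = bio->parent;	/* walk up the chain */
	}
	if (!trace_in_loop && bio->trace_completion) {
		sim_trace_complete(bio);
		bio->trace_completion = 0;
	}
}

/* Queue sectors 0..15, split at 8, complete both; return event count. */
static int run_scenario(int trace_in_loop)
{
	struct sim_bio parent = { 0, 16, 1, 1, 0 };
	struct sim_bio split;

	nr_events = 0;
	sim_split(&parent, &split, 8);
	sim_endio(&parent, trace_in_loop);
	sim_endio(&split, trace_in_loop);
	return nr_events;
}
```

In this model, run_scenario(0) records a single complete event for the
already-advanced parent, while run_scenario(1) records one event per
chained bio, and together they cover the range of the queue event.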

diff --git a/block/bio.c b/block/bio.c
index 44205dfb6b60..dcb23e75f083 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1375,8 +1375,7 @@ static inline bool bio_remaining_done(struct bio *bio)
  *
  * bio_endio() can be called several times on a bio that has been chained
  * using bio_chain(). The ->bi_end_io() function will only be called the
- * last time. At this point the BLK_TA_COMPLETE tracing event will be
- * generated if BIO_TRACE_COMPLETION is set.
+ * last time.
  **/
 void bio_endio(struct bio *bio)
 {
@@ -1389,6 +1388,11 @@ void bio_endio(struct bio *bio)
 	if (bio->bi_bdev)
 		rq_qos_done_bio(bio->bi_bdev->bd_disk->queue, bio);

+	if (bio->bi_bdev && bio_flagged(bio, BIO_TRACE_COMPLETION)) {
+		trace_block_bio_complete(bio->bi_bdev->bd_disk->queue, bio);
+		bio_clear_flag(bio, BIO_TRACE_COMPLETION);
+	}
+
 	/*
 	 * Need to have a real endio function for chained bios, otherwise
 	 * various corner cases will break (like stacking block devices that
@@ -1400,18 +1404,13 @@ void bio_endio(struct bio *bio)
 	if (bio->bi_end_io == bio_chain_endio) {
 		bio = __bio_chain_endio(bio);
 		goto again;
+	} else {
+		blk_throtl_bio_endio(bio);
+		/* release cgroup info */
+		bio_uninit(bio);
+		if (bio->bi_end_io)
+			bio->bi_end_io(bio);
 	}
-
-	if (bio->bi_bdev && bio_flagged(bio, BIO_TRACE_COMPLETION)) {
-		trace_block_bio_complete(bio->bi_bdev->bd_disk->queue, bio);
-		bio_clear_flag(bio, BIO_TRACE_COMPLETION);
-	}
-
-	blk_throtl_bio_endio(bio);
-	/* release cgroup info */
-	bio_uninit(bio);
-	if (bio->bi_end_io)
-		bio->bi_end_io(bio);
 }
 EXPORT_SYMBOL(bio_endio);

--
2.31.1