Re: [PATCH v2] block: fix trace completion for chained bio

From: Edward Hsieh
Date: Tue May 25 2021 - 05:38:10 EST




On 5/10/2021 10:06 AM, Edward Hsieh wrote:

On 4/23/2021 4:04 PM, Edward Hsieh wrote:
On 3/23/2021 5:22 AM, NeilBrown wrote:
On Wed, Mar 03 2021, edwardh wrote:

From: Edward Hsieh <edwardh@xxxxxxxxxxxx>

For a chained bio, trace_block_bio_complete in bio_endio is currently
called only once, for the parent bio, after all chained bios have
completed. However, the sector and size of the parent bio are modified
in bio_split. Therefore, the size and sector of the complete events
might not match the queue events in blktrace.

The original fix of bio completion trace <fbbaf700e7b1> ("block: trace
completion of all bios.") wants multiple complete events to correspond
to one queue event but missed this.

An md/raid5 read with a bio that crosses chunk boundaries can reproduce
this issue.

To fix this, move the completion trace into the loop so that it is
called for every chained bio.
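
For illustration only, the mismatch can be reproduced with a toy
userspace model. This is a sketch, not kernel code: struct toy_bio,
toy_split, toy_endio and the event log are hypothetical stand-ins for
struct bio, bio_split plus bio_chain, bio_endio and blktrace, and the
model assumes a single split.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct bio; the fields are hypothetical. */
struct toy_bio {
    unsigned long sector;     /* start sector */
    unsigned int nr_sectors;  /* length in sectors */
    int remaining;            /* models the pending-completion count */
    struct toy_bio *parent;   /* non-NULL models a chained bio */
};

struct toy_event {
    unsigned long sector;
    unsigned int nr_sectors;
};

struct toy_log {
    struct toy_event ev[8];
    size_t n;
};

/* Models bio_split() followed by bio_chain(): the front part moves
 * into `split`, and the parent is advanced and shrunk in place. */
static void toy_split(struct toy_bio *parent, struct toy_bio *split,
                      unsigned int front)
{
    assert(front > 0 && front < parent->nr_sectors);
    split->sector = parent->sector;
    split->nr_sectors = front;
    split->remaining = 1;
    split->parent = parent;
    parent->sector += front;
    parent->nr_sectors -= front;
    parent->remaining++;  /* parent now also waits for the split part */
}

/* Records one "complete" event with the bio's current sector and size. */
static void toy_trace_complete(struct toy_log *log, const struct toy_bio *bio)
{
    assert(log->n < sizeof(log->ev) / sizeof(log->ev[0]));
    log->ev[log->n].sector = bio->sector;
    log->ev[log->n].nr_sectors = bio->nr_sectors;
    log->n++;
}

/* Models the bio_endio() loop. With trace_in_loop == 0 only the last
 * bio in the chain is traced; with 1 every completed bio is traced. */
static void toy_endio(struct toy_bio *bio, int trace_in_loop,
                      struct toy_log *log)
{
    while (bio != NULL) {
        if (--bio->remaining > 0)
            return;  /* other parts are still in flight */
        if (trace_in_loop || bio->parent == NULL)
            toy_trace_complete(log, bio);
        bio = bio->parent;  /* walk up the chain instead of recursing */
    }
}
```

In this model, a bio queued at sector 100 with 16 sectors and split
after 6 sectors logs a single complete event (106, 10) when only the
last bio is traced. Tracing inside the loop logs (100, 6) and (106, 10),
which together cover the queued range.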

Thanks.  I think this is correct as far as tracing goes.
However the code still looks a bit odd.

The comment for the handling of bio_chain_endio suggests that the *only*
purpose for that is to avoid deep recursion.  That suggests it should be
at the end of the function.
As it is, blk_throtl_bio_endio() and bio_uninit() are only called on the
last bio in a chain.
That seems wrong.

I'd be more comfortable if the patch moved the bio_chain_endio()
handling to the end, after all of that.
So the function would end:

if (bio->bi_end_io == bio_chain_endio) {
    bio = __bio_chain_endio(bio);
    goto again;
} else if (bio->bi_end_io)
    bio->bi_end_io(bio);

Jens:  can you see any reason why those functions must only be called on
the last bio in the chain?

Thanks,
NeilBrown


Hi Neil and Jens,

From the commit message, bio_uninit is put here for bios allocated in
special ways (e.g., on the stack) that will not be released by bio_free.
For a chained bio, __bio_chain_endio invokes bio_put and releases the
resources, so it seems that we don't need to call bio_uninit for a
chained bio.

The blk_throtl_bio_endio is used to update the latency for the throttle
group. I think the latency should only be updated after the whole bio is
finished?

To stay consistent with the "tail call optimization" mentioned in the
comment, I'd suggest wrapping the remaining statements in an else. What
do you think?

if (bio->bi_end_io == bio_chain_endio) {
     bio = __bio_chain_endio(bio);
     goto again;
} else {
     blk_throtl_bio_endio(bio);
     /* release cgroup info */
     bio_uninit(bio);
     if (bio->bi_end_io)
         bio->bi_end_io(bio);
}
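
As a rough check of this control flow, here is a toy userspace sketch
(not kernel code). struct sketch_bio and its counters are hypothetical:
`finalized` stands in for the blk_throtl_bio_endio() and bio_uninit()
calls, and following `parent` stands in for __bio_chain_endio().

```c
#include <assert.h>
#include <stddef.h>

struct sketch_bio;
typedef void (*sketch_end_io_fn)(struct sketch_bio *);

/* Toy stand-in for struct bio; all fields are hypothetical. */
struct sketch_bio {
    int remaining;              /* models the pending-completion count */
    int chained;                /* models bi_end_io == bio_chain_endio */
    struct sketch_bio *parent;  /* valid only when chained != 0 */
    sketch_end_io_fn end_io;    /* owner's completion callback, may be NULL */
    int finalized;              /* times throttle update + uninit ran */
    int ended;                  /* times end_io ran */
};

/* Example callback: just counts invocations. */
static void sketch_count_end_io(struct sketch_bio *bio)
{
    bio->ended++;
}

/* Mirrors the proposed tail of bio_endio(): chained bios only hand
 * over to their parent; the last bio is finalized and completed. */
static void sketch_endio(struct sketch_bio *bio)
{
again:
    assert(bio != NULL);
    if (--bio->remaining > 0)
        return;  /* other parts are still in flight */

    if (bio->chained) {
        bio = bio->parent;  /* stand-in for __bio_chain_endio() */
        goto again;
    } else {
        bio->finalized++;   /* stand-in for throttle update + uninit */
        if (bio->end_io)
            bio->end_io(bio);
    }
}
```

In this model, a chain a -> b -> root finalizes only root, and root's
callback runs exactly once after every part has completed.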

Thanks,
Edward Hsieh

Hi Neil and Jens,

Any feedback on this one?

Thank you,
Edward Hsieh

Hi Neil and Jens,

Any comments?

Thank you,
Edward Hsieh