Re: [PATCH] bcache: fix double bio_endio completion in detached_dev_end_io

From: Stephen Zhang

Date: Thu Jan 15 2026 - 04:18:16 EST


Kent Overstreet <kent.overstreet@xxxxxxxxx> wrote on Thu, Jan 15, 2026 at 16:59:
>
> On Thu, Jan 15, 2026 at 04:06:53PM +0800, Stephen Zhang wrote:
> > zhangshida <starzhangzsd@xxxxxxxxx> wrote on Thu, Jan 15, 2026 at 15:48:
> > >
> > > From: Shida Zhang <zhangshida@xxxxxxxxxx>
> > >
> > > Commit 53280e398471 ("bcache: fix improper use of bi_end_io") attempted
> > > to fix up bio completions by replacing manual bi_end_io calls with
> > > bio_endio(). However, it introduced a double-completion bug in the
> > > detached_dev path.
> > >
> > > In a normal completion path, the call stack is:
> > >   blk_update_request
> > >     bio_endio(bio)
> > >       bio->bi_end_io(bio) -> detached_dev_end_io
> > >         bio_endio(bio) <- BUG: second call
> > >
> > > To fix this, detached_dev_end_io() must manually call the next completion
> > > handler in the chain.
> > >
> > > However, in detached_dev_do_request(), if a discard is unsupported, the
> > > bio is rejected before being submitted to the lower level. In this case,
> > > we can use the standard bio_endio().
> > >
> > >   detached_dev_do_request
> > >     bio_endio(bio) <- Correct: starts completion for
> > >                       unsubmitted bio
> > >
> > > Fixes: 53280e398471 ("bcache: fix improper use of bi_end_io")
> > > Signed-off-by: Shida Zhang <zhangshida@xxxxxxxxxx>
> > > ---
> > > drivers/md/bcache/request.c | 11 +++++++++--
> > > 1 file changed, 9 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
> > > index 82fdea7dea7..ec712b5879f 100644
> > > --- a/drivers/md/bcache/request.c
> > > +++ b/drivers/md/bcache/request.c
> > > @@ -1104,7 +1104,14 @@ static void detached_dev_end_io(struct bio *bio)
> > >  	}
> > >
> > >  	kfree(ddip);
> > > -	bio_endio(bio);
> > > +	/*
> > > +	 * This is an exception where bio_endio() cannot be used.
> > > +	 * We are already called from within a bio_endio() stack;
> > > +	 * calling it again here would result in a double-completion
> > > +	 * (decrementing bi_remaining twice). We must call the
> > > +	 * original completion routine directly.
> > > +	 */
> > > +	bio->bi_end_io(bio);
> > >  }
> > >
> > >  static void detached_dev_do_request(struct bcache_device *d, struct bio *bio,
> > > @@ -1136,7 +1143,7 @@ static void detached_dev_do_request(struct bcache_device *d, struct bio *bio,
> > >
> > >  	if ((bio_op(bio) == REQ_OP_DISCARD) &&
> > >  	    !bdev_max_discard_sectors(dc->bdev))
> > > -		detached_dev_end_io(bio);
> > > +		bio_endio(bio);
> > >  	else
> > >  		submit_bio_noacct(bio);
> > >  }
> > > --
> > > 2.34.1
> > >
> >
> > Hi,
> >
> > My apologies for the late reply; I was slow to check my work inbox.
> > I see the issue reported in:
> > https://lore.kernel.org/all/aWU2mO5v6RezmIpZ@xxxxxxxxxxxxxx/
> > This was indeed an oversight on my part.
> >
> > To resolve this quickly, I've prepared a direct fix for the
> > double-completion bug.
> > I hope this is better than a full revert.
>
> In general, it's just safer, simpler and saner to revert; reverting a
> patch is not something to be avoided. If there's _any_ new trickiness
> required in the fix, it's better to just revert than rush things.
>
> I revert or kick patches out - including my own - all the time.
>
> That said, this patch is good, you've got a comment explaining what's
> going on. Christoph's version of just always cloning the bio is
> definitely cleaner, but that's a bigger change,

Thank you for the feedback.

I sincerely hope that Christoph's version can resolve this issue properly, and
that it helps remedy the regression I introduced. I appreciate everyone's
patience and the efforts to address this.
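
For anyone following along, here is a minimal userspace sketch of the
double completion described in the patch. It is only a model, not kernel
code: struct model_bio, model_bio_endio(), run_completion() and the other
names are hypothetical stand-ins, and the unconditional decrement is a
simplification of what the real bio_endio() does. It only illustrates why
re-entering bio_endio() from inside an end_io handler drops the counter
twice, while calling the restored handler directly does not.

```c
#include <assert.h>
#include <stdlib.h>

struct model_bio;
typedef void (*end_io_fn)(struct model_bio *bio);

/* Simplified stand-in for struct bio; not the kernel layout. */
struct model_bio {
	int remaining;     /* models bi_remaining, starts at 1 */
	end_io_fn end_io;  /* models bi_end_io */
	void *priv;        /* models bi_private */
};

/* Saved upper-layer completion, playing the role of ddip in the patch. */
struct saved_ctx {
	end_io_fn end_io;
	void *priv;
};

/* Simplified bio_endio(): drop one reference, then run the handler. */
static void model_bio_endio(struct model_bio *bio)
{
	bio->remaining--;
	if (bio->end_io)
		bio->end_io(bio);
}

/* The upper layer's original completion handler. */
static void upper_end_io(struct model_bio *bio)
{
	(void)bio; /* the upper layer would finish its request here */
}

/* Put the original handler and private data back, then free the context. */
static void restore_and_free(struct model_bio *bio)
{
	struct saved_ctx *ctx = bio->priv;

	bio->end_io = ctx->end_io;
	bio->priv = ctx->priv;
	free(ctx);
}

/* Buggy variant: re-enters the completion entry point. */
static void detached_end_io_buggy(struct model_bio *bio)
{
	restore_and_free(bio);
	model_bio_endio(bio); /* second decrement of remaining */
}

/* Fixed variant: calls the restored handler directly. */
static void detached_end_io_fixed(struct model_bio *bio)
{
	restore_and_free(bio);
	bio->end_io(bio);
}

/* Hook the handler in, complete the bio once, return the final counter. */
static int run_completion(end_io_fn handler)
{
	struct model_bio bio = { .remaining = 1, .end_io = upper_end_io, .priv = NULL };
	struct saved_ctx *ctx = malloc(sizeof(*ctx));

	assert(ctx != NULL);
	ctx->end_io = bio.end_io;
	ctx->priv = bio.priv;
	bio.end_io = handler;
	bio.priv = ctx;

	model_bio_endio(&bio); /* what the lower layer does on completion */
	return bio.remaining;
}
```

In this model, run_completion(detached_end_io_buggy) leaves the counter at
-1, while run_completion(detached_end_io_fixed) leaves it at 0.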

Let me know if there's anything further needed from my side.

Best regards,
Shida