Re: [PATCH] block: recalculate segment count for multi-segment discard requests correctly

From: Ming Lei
Date: Wed Feb 03 2021 - 21:20:38 EST


On Wed, Feb 03, 2021 at 11:23:37AM -0500, David Jeffery wrote:
> On Wed, Feb 03, 2021 at 10:35:17AM +0800, Ming Lei wrote:
> >
> > On Tue, Feb 02, 2021 at 03:43:55PM -0500, David Jeffery wrote:
> > > The return 0 does seem to be an old relic that does not make sense anymore.
> > > Moving REQ_OP_SECURE_ERASE to be with discard and removing the old return 0,
> > > is this what you had in mind?
> > >
> > >
> > > diff --git a/block/blk-merge.c b/block/blk-merge.c
> > > index 808768f6b174..68458aa01b05 100644
> > > --- a/block/blk-merge.c
> > > +++ b/block/blk-merge.c
> > > @@ -383,8 +383,14 @@ unsigned int blk_recalc_rq_segments(struct request *rq)
> > >  	switch (bio_op(rq->bio)) {
> > >  	case REQ_OP_DISCARD:
> > >  	case REQ_OP_SECURE_ERASE:
> > > +		if (queue_max_discard_segments(rq->q) > 1) {
> > > +			struct bio *bio = rq->bio;
> > > +			for_each_bio(bio)
> > > +				nr_phys_segs++;
> > > +			return nr_phys_segs;
> > > +		}
> > > +		/* fall through */
> > >  	case REQ_OP_WRITE_ZEROES:
> > > -		return 0;
> > >  	case REQ_OP_WRITE_SAME:
> > >  		return 1;
> >
> > WRITE_SAME uses the same buffer, so nr_segments is still one; WRITE_ZEROES
> > doesn't need an extra payload, so nr_segments is zero. See
> > blk_bio_write_zeroes_split(), blk_bio_write_same_split(), attempt_merge()
> > and blk_rq_merge_ok().
> >
>
> I thought you mentioned virtio-blk because of how some drivers handle
> zeroing and discarding similarly and wanted to align the segment count with
> discard behavior for WRITE_ZEROES too. (Though that would also need an update

virtio-blk is just one example of a driver that supports both single-range
and multi-range discard; meanwhile, virtblk_setup_discard_write_zeroes()
simply maps write zeroes onto discard directly.

I just found that blk_rq_nr_discard_segments() always returns at least one
segment, so it looks like your patch is enough to avoid the warning.
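
To illustrate the clamp, here is a minimal userspace sketch. It is not the
kernel source: struct model_request and model_nr_discard_segments() are
hypothetical stand-ins for struct request and blk_rq_nr_discard_segments(),
and the only behaviour modelled is "never report fewer than one segment".

```c
/*
 * Userspace model only. struct model_request is a hypothetical stand-in
 * for the kernel's struct request and carries just the one field needed.
 */
struct model_request {
	unsigned short nr_phys_segments;
};

/*
 * Models the clamp: a stored segment count of 0 is reported as 1,
 * and any larger count is passed through unchanged.
 */
static unsigned short model_nr_discard_segments(const struct model_request *rq)
{
	return rq->nr_phys_segments > 1 ? rq->nr_phys_segments : 1;
}
```

Under this model, a request whose stored count is 0 and one whose stored
count is 1 are indistinguishable to a driver that only calls the helper.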

> to blk_bio_write_zeroes_split as you pointed out.) So you want me to leave
> WRITE_ZEROES behavior alone and let blk_rq_nr_discard_segments() keep doing
> the hiding of a 0 rq->nr_phys_segments as 1 segment in the WRITE_ZEROES treated
> as a discard case?
>
> diff --git a/block/blk-merge.c b/block/blk-merge.c
> index 808768f6b174..756473295f19 100644
> --- a/block/blk-merge.c
> +++ b/block/blk-merge.c
> @@ -383,6 +383,14 @@ unsigned int blk_recalc_rq_segments(struct request *rq)
>  	switch (bio_op(rq->bio)) {
>  	case REQ_OP_DISCARD:
>  	case REQ_OP_SECURE_ERASE:
> +		if (queue_max_discard_segments(rq->q) > 1) {
> +			struct bio *bio = rq->bio;
> +
> +			for_each_bio(bio)
> +				nr_phys_segs++;
> +			return nr_phys_segs;
> +		}
> +		return 1;
>  	case REQ_OP_WRITE_ZEROES:
>  		return 0;
>  	case REQ_OP_WRITE_SAME:

This patch explicitly returns 1 for a single-range discard. However, that
isn't necessary, because blk_rq_nr_discard_segments() already reports at
least one segment.
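
For reference, the counting logic in the hunk above can be modelled in
userspace roughly as follows. This is only a sketch under stated
assumptions: the model_* types and names are hypothetical, the queue limit
is stored on the request for brevity, and the loop stands in for
for_each_bio() by walking a singly linked chain.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the request operations discussed here. */
enum model_op {
	MODEL_OP_DISCARD,
	MODEL_OP_SECURE_ERASE,
	MODEL_OP_WRITE_ZEROES,
	MODEL_OP_WRITE_SAME,
};

/* Hypothetical stand-in for struct bio: only the chain pointer. */
struct model_bio {
	struct model_bio *bi_next;
};

/* Hypothetical stand-in for struct request plus its queue limit. */
struct model_rq {
	enum model_op op;
	unsigned int max_discard_segments;
	struct model_bio *bio;
};

/* Models the segment recalculation for the payload-less operations. */
static unsigned int model_recalc_segments(const struct model_rq *rq)
{
	unsigned int nr_phys_segs = 0;
	const struct model_bio *bio;

	switch (rq->op) {
	case MODEL_OP_DISCARD:
	case MODEL_OP_SECURE_ERASE:
		if (rq->max_discard_segments > 1) {
			/* Multi-range discard: one segment per bio in the chain. */
			for (bio = rq->bio; bio; bio = bio->bi_next)
				nr_phys_segs++;
			return nr_phys_segs;
		}
		/* Single-range discard: explicit 1, as in the hunk above. */
		return 1;
	case MODEL_OP_WRITE_ZEROES:
		/* No payload, so no segments. */
		return 0;
	case MODEL_OP_WRITE_SAME:
		/* One shared buffer, so one segment. */
		return 1;
	}
	return 0;
}
```

With a chain of three bios, this model yields 3 when max_discard_segments
is greater than 1 and 1 otherwise; WRITE_ZEROES stays at 0 either way.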

Maybe we can align this with blk_bio_discard_split() in the future, but
that can be done as a cleanup.

Thanks,
Ming