Re: [RFC PATCH] fs: block_dev: compute nr_vecs hint for improving writeback bvecs allocation

From: Dave Chinner
Date: Wed Jan 06 2021 - 17:22:13 EST


On Wed, Jan 06, 2021 at 04:45:48PM +0800, Ming Lei wrote:
> On Tue, Jan 05, 2021 at 07:39:38PM +0100, Christoph Hellwig wrote:
> > At least for iomap I think this is the wrong approach. Between the
> > iomap and writeback_control we know the maximum size of the writeback
> > request and can just use that.
>
> I think writeback_control can tell us nothing about max pages in single
> bio:

By definition, the iomap tells us exactly how big the IO is going to
be. i.e. an iomap spans a single contiguous range that we are going
to issue IO on. Hence we can use that to size the bio exactly
right for direct IO.

> - wbc->nr_to_write controls how many pages to writeback, this pages
> usually don't belong to same bio. Also this number is often much
> bigger than BIO_MAX_PAGES.
>
> - wbc->range_start/range_end is similar too, which is often much more
> bigger than BIO_MAX_PAGES.
>
> Also page/blocks_in_page can be mapped to different extent too, which is
> only available when wpc->ops->map_blocks() is returned,

We only allocate the bio -after- calling ->map_blocks() to obtain
the iomap for the given writeback range request. Hence we
already know how large the BIO could be before we allocate it.

> which looks not
> different with mpage_writepages(), in which bio is allocated with
> BIO_MAX_PAGES vecs too.

__mpage_writepage() only maps a page at a time, so it can't tell
ahead of time how big the bio is going to need to be as it doesn't
return/cache a contiguous extent range. So it's actually very
different to the iomap writeback code, and effectively does require
a BIO_MAX_PAGES vecs allocation all the time...

> Or you mean we can use iomap->length for this purpose? But iomap->length
> still is still too big in case of xfs.

if we are doing small random writeback into large extents (i.e.
iomap->length is large), then it is trivial to detect that we are
doing random writes rather than sequential writes by checking if the
current page is sequential to the last sector in the current bio.
We already do this non-sequential IO checking to determine if a new
bio needs to be allocated in iomap_can_add_to_ioend(), and we also
know how large the current contiguous range mapped into the current
bio chain is (ioend->io_size). Hence we've got everything we need to
determine whether we should do a large or small bio vec allocation
in the iomap writeback path...

Cheers,

Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx