Re: [PATCH 06/14] block: lift setting the readahead size into the block layer

From: Christoph Hellwig
Date: Wed Sep 02 2020 - 11:16:38 EST


On Wed, Aug 26, 2020 at 06:07:38PM -0400, Mike Snitzer wrote:
> On Sun, Jul 26 2020 at 11:03am -0400,
> Christoph Hellwig <hch@xxxxxx> wrote:
>
> > Drivers shouldn't really mess with the readahead size, as that is a VM
> > concept. Instead set it based on the optimal I/O size by lifting the
> > algorithm from the md driver when registering the disk. Also set
> > bdi->io_pages there as well by applying the same scheme based on
> > max_sectors.
> >
> > Signed-off-by: Christoph Hellwig <hch@xxxxxx>
> > ---
> > block/blk-settings.c | 5 ++---
> > block/blk-sysfs.c | 1 -
> > block/genhd.c | 13 +++++++++++--
> > drivers/block/aoe/aoeblk.c | 2 --
> > drivers/block/drbd/drbd_nl.c | 12 +-----------
> > drivers/md/bcache/super.c | 4 ----
> > drivers/md/dm-table.c | 3 ---
> > drivers/md/raid0.c | 16 ----------------
> > drivers/md/raid10.c | 24 +-----------------------
> > drivers/md/raid5.c | 13 +------------
> > 10 files changed, 16 insertions(+), 77 deletions(-)
>
>
> In general these changes need a solid audit relative to stacking
> drivers. That is, the limits stacking methods (blk_stack_limits)
> vs lower level allocation methods (__device_add_disk).
>
> You optimized for lowlevel __device_add_disk establishing the bdi's
> ra_pages and io_pages. That is at the beginning of disk allocation,
> well before any build up of stacking driver's queue_io_opt() -- which
> was previously done in disk_stack_limits or driver specific methods
> (e.g. dm_table_set_restrictions) that are called _after_ all the limits
> stacking occurs.
>
> By inverting the setting of the bdi's ra_pages and io_pages to be done
> so early in __device_add_disk it'll break properly setting these values
> for at least DM afaict.

ra_pages was never inherited by stacking drivers. You can check this by
modifying it on an underlying device and then creating a trivial dm or
md device on top of it. And I think that is a good thing: in general
drivers shouldn't mess with it if we can avoid it. I've kept the legacy
aoe and md parity raid cases; the aoe one looks pretty weird, while the
md one is at least remotely sensible.

->io_pages is still inherited in disk_stack_limits, just like before,
so there is no change there either.
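
For readers following along, here is a rough userspace sketch of the
scheme the commit message describes: readahead derived from the optimal
I/O size, and io_pages derived from max_sectors. It is not the patch
itself. The sketch_* names are hypothetical, and the 4 KiB page size,
the 128 KiB floor, and the factor of two (modeled on the "two full
stripes" rule the md code used) are assumptions made for illustration.

```c
#include <assert.h>

/* Assumptions for this sketch only: 4 KiB pages, 128 KiB default readahead. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE (1u << SKETCH_PAGE_SHIFT)
#define SKETCH_DEFAULT_RA_PAGES ((128u * 1024u) / SKETCH_PAGE_SIZE)

/* A sector is 512 bytes (1 << 9), so the shift below must not go negative. */
static_assert(SKETCH_PAGE_SHIFT >= 9, "page must be at least one sector");

/*
 * Readahead window in pages: twice the optimal I/O size (assumed factor),
 * but never less than the default window.
 */
static unsigned long long sketch_ra_pages(unsigned int io_opt_bytes)
{
	unsigned long long pages =
		(unsigned long long)io_opt_bytes * 2 / SKETCH_PAGE_SIZE;

	return pages > SKETCH_DEFAULT_RA_PAGES ? pages : SKETCH_DEFAULT_RA_PAGES;
}

/* io_pages: max_sectors, given in 512-byte sectors, converted to pages. */
static unsigned long sketch_io_pages(unsigned int max_sectors)
{
	return max_sectors >> (SKETCH_PAGE_SHIFT - 9);
}
```

Under these assumptions a device that reports no optimal I/O size keeps
the default window, while one reporting a 1 MiB optimal I/O size would
get a 2 MiB readahead window.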