Re: [PATCH] block: don't make BLK_DEF_MAX_SECTORS too big

From: Ming Lei
Date: Wed Mar 30 2016 - 08:13:15 EST


Hi Shaohua,

On Wed, Mar 30, 2016 at 10:27 AM, Shaohua Li <shli@xxxxxx> wrote:
> On Wed, Mar 30, 2016 at 09:39:35AM +0800, Ming Lei wrote:
>> On Wed, Mar 30, 2016 at 12:42 AM, Shaohua Li <shli@xxxxxx> wrote:
>> > bio_alloc_bioset() allocates bvecs from bvec_slabs which can only
>> > allocate maximum 256 bvec (eg, 1M for 4k pages). We can't bump
>> > BLK_DEF_MAX_SECTORS to exceed this value otherwise bio_alloc_bioset will
>> > fail.
>> >
>> > In the future, we can extend the size either bvec_slabs array is
>> > expanded or the upcoming multipage bvec is added if pages are
>> > contiguous. This one is suitable for stable.
>> >
>> > Fixes: d2be537c3ba (block: bump BLK_DEF_MAX_SECTORS to 2560)
>> > Reported-by: Sebastian Roesner <sroesner-kernelorg@xxxxxxxxxxxxxxxxx>
>> > Cc: stable@xxxxxxxxxxxxxxx (4.2+)
>> > Cc: Ming Lei <ming.lei@xxxxxxxxxxxxx>
>> > Reviewed-by: Jeff Moyer <jmoyer@xxxxxxxxxx>
>> > Signed-off-by: Shaohua Li <shli@xxxxxx>
>> > ---
>> > include/linux/blkdev.h | 6 +++++-
>> > 1 file changed, 5 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
>> > index 7e5d7e0..da64325 100644
>> > --- a/include/linux/blkdev.h
>> > +++ b/include/linux/blkdev.h
>> > @@ -1153,7 +1153,11 @@ extern int blk_verify_command(unsigned char *cmd, fmode_t has_write_perm);
>> > enum blk_default_limits {
>> > BLK_MAX_SEGMENTS = 128,
>> > BLK_SAFE_MAX_SECTORS = 255,
>> > - BLK_DEF_MAX_SECTORS = 2560,
>> > + /*
>> > + * if you change this, please also change bvec_alloc and BIO_MAX_PAGES.
>> > + * Otherwise bio_alloc_bioset will break.
>> > + */
>> > + BLK_DEF_MAX_SECTORS = BIO_MAX_SECTORS,
>>
>> Thinking about it further, it isn't good to change the default max
>> sectors because
>> the patch affects REQ_PC bios too, which don't have the 1Mbytes limit at all.
>
> what breaks setting REQ_PC to 1M limit? I can understand bigger limit might help
> big raid array performance, but REQ_PC isn't the case.

I mean REQ_PC can include at most 1024 vectors intead of 256, so looks it isn't
fair to introduce the strict limit for all kinds of requests.

More importantly, the max sector limit is for limitting max sectors in
a request,
and is used for bios merging, not same with bio's 256 bvecs limit.

>
>> So suggest to just change bcache's queue max sector limit to 1M, that means
>> we shouldn't encourage bcache's usage of bypassing bio_add_page().
>
> Don't think this is a good idea. This is a limitation of block core,

This bio's 256 bvecs limitation is from block implementation, think about why
one bvec just includes one page, instead of one segment. In the future, it can
be improved absolutely, that is why I said it isn't good to use BIO_MAX_SECTORS.
Also you can find that there is only one user of BIO_MAX_SECTORS.

> block core should make sure the limitation doesn't break, not the
> driver. On the other hand, are you going to fix all drivers? drivers can
> set arbitrary max sector limit.

The issue only exists if drivers(fs, dm, md, bcache) do not use bio_add_page().
All this kind of usage shouldn't be encouraged.

So how about fixing the issue by introducing the limit into get_max_io_size()?
Such as, add something like below at the end of this function?

sectors = min_t(unsigned, sectors, BIO_MAX_PAGES <<
(PAGE_CACHE_SHIFT - 9));


thanks,
Ming

>
> Thanks,
> Shaohua