Re: [PATCH 3/6] fs: Convert block_read_full_page to be synchronous

From: Matthew Wilcox
Date: Fri Oct 23 2020 - 09:21:41 EST


On Thu, Oct 22, 2020 at 04:40:11PM -0700, Eric Biggers wrote:
> On Thu, Oct 22, 2020 at 10:22:25PM +0100, Matthew Wilcox (Oracle) wrote:
> > +static int readpage_submit_bhs(struct page *page, struct blk_completion *cmpl,
> > +		unsigned int nr, struct buffer_head **bhs)
> > +{
> > +	struct bio *bio = NULL;
> > +	unsigned int i;
> > +	int err;
> > +
> > +	blk_completion_init(cmpl, nr);
> > +
> > +	for (i = 0; i < nr; i++) {
> > +		struct buffer_head *bh = bhs[i];
> > +		sector_t sector = bh->b_blocknr * (bh->b_size >> 9);
> > +		bool same_page;
> > +
> > +		if (buffer_uptodate(bh)) {
> > +			end_buffer_async_read(bh, 1);
> > +			blk_completion_sub(cmpl, BLK_STS_OK, 1);
> > +			continue;
> > +		}
> > +		if (bio) {
> > +			if (bio_end_sector(bio) == sector &&
> > +			    __bio_try_merge_page(bio, bh->b_page, bh->b_size,
> > +					bh_offset(bh), &same_page))
> > +				continue;
> > +			submit_bio(bio);
> > +		}
> > +		bio = bio_alloc(GFP_NOIO, 1);
> > +		bio_set_dev(bio, bh->b_bdev);
> > +		bio->bi_iter.bi_sector = sector;
> > +		bio_add_page(bio, bh->b_page, bh->b_size, bh_offset(bh));
> > +		bio->bi_end_io = readpage_end_bio;
> > +		bio->bi_private = cmpl;
> > +		/* Take care of bh's that straddle the end of the device */
> > +		guard_bio_eod(bio);
> > +	}
>
> The following is needed to set the bio encryption context for the
> '-o inlinecrypt' case on ext4:
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 95c338e2b99c..546a08c5003b 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -2237,6 +2237,7 @@ static int readpage_submit_bhs(struct page *page, struct blk_completion *cmpl,
>  			submit_bio(bio);
>  		}
>  		bio = bio_alloc(GFP_NOIO, 1);
> +		fscrypt_set_bio_crypt_ctx_bh(bio, bh, GFP_NOIO);
>  		bio_set_dev(bio, bh->b_bdev);
>  		bio->bi_iter.bi_sector = sector;
>  		bio_add_page(bio, bh->b_page, bh->b_size, bh_offset(bh));

Thanks! I saw that and had every intention of copying it across.
And then I forgot. I'll add that. I'm also going to do:

-			    __bio_try_merge_page(bio, bh->b_page, bh->b_size,
-					bh_offset(bh), &same_page))
+			    bio_add_page(bio, bh->b_page, bh->b_size,
+					bh_offset(bh)))

I wonder about allocating bios that can accommodate more bvecs. Not sure
how often filesystems have blocks which are adjacent on disk but go into
non-adjacent sub-page blocks. It's certainly possible that a filesystem
might have a page consisting of DDhhDDDD ('D' for Data, 'h' for hole),
but how likely is it to have written the two data chunks next to each
other on disk? Maybe with O_SYNC?
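
To make the bvec question concrete, here is a rough userspace model of
the merge decision, not kernel code. struct blk and count_bios() are
hypothetical names invented for this sketch, and the model assumes a
block can join the current bio only if it is contiguous on disk: it
then either extends the last segment (if it is also contiguous in the
page) or takes a new segment (if the bio has room for one).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one sub-page block; not a kernel structure. */
struct blk {
	unsigned long sector;	/* first 512-byte sector on disk */
	unsigned int offset;	/* byte offset of the block inside the page */
	unsigned int size;	/* block size in bytes */
	int hole;		/* nonzero: nothing on disk, no read needed */
};

/*
 * Count the bios needed to read blks[0..nr) when each bio can hold at
 * most max_vecs segments.  This only models the decision; it does no I/O.
 */
static unsigned int count_bios(const struct blk *blks, size_t nr,
		unsigned int max_vecs)
{
	unsigned int bios = 0, vecs = 0;
	unsigned long end_sector = 0;	/* sector just past the current bio */
	unsigned int end_offset = 0;	/* page offset just past the last segment */
	size_t i;

	assert(max_vecs > 0);

	for (i = 0; i < nr; i++) {
		const struct blk *b = &blks[i];
		int merged = 0;

		if (b->hole)
			continue;

		if (bios > 0 && b->sector == end_sector) {
			if (b->offset == end_offset) {
				/* Contiguous on disk and in the page. */
				merged = 1;
			} else if (vecs < max_vecs) {
				/* Contiguous on disk only: needs a new segment. */
				vecs++;
				merged = 1;
			}
		}
		if (!merged) {
			/* Start a new bio with a single segment. */
			bios++;
			vecs = 1;
		}
		end_sector = b->sector + (b->size >> 9);
		end_offset = b->offset + b->size;
	}
	return bios;
}
```

For a DDhhDDDD page of 512-byte blocks where the six data blocks occupy
consecutive sectors, this model gives two bios with max_vecs = 1 and
one bio with max_vecs = 2. If the two data chunks are not adjacent on
disk, the extra bvec buys nothing, which is the open question above.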

Anyway, this patchset needs some more thought because I've just seen
the path from mpage_readahead() to block_read_full_page() that should
definitely not be synchronous.