Re: [RFC] Bypass filesystems for reading cached pages

From: Andreas Gruenbacher
Date: Thu Jul 02 2020 - 11:17:01 EST


On Wed, Jun 24, 2020 at 2:35 PM Andreas Gruenbacher <agruenba@xxxxxxxxxx> wrote:
> On Mon, Jun 22, 2020 at 8:13 PM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > On Mon, Jun 22, 2020 at 04:35:05PM +0200, Andreas Gruenbacher wrote:
> > > I'm fine with not moving that functionality into the VFS. The problem
> > > I have in gfs2 is that taking glocks is really expensive. Part of that
> > > overhead is accidental, but we definitely won't be able to fix it in
> > > the short term. So something like the IOCB_CACHED flag that prevents
> > > generic_file_read_iter from issuing readahead I/O would save the day
> > > for us. Does that idea stand a chance?
> >
> > For the short-term fix, is switching to a trylock in gfs2_readahead()
> > acceptable?
>
> Well, it's the only thing we can do for now, right?

It turns out that gfs2 can still deadlock with a trylock in
gfs2_readahead, just differently: in this instance, gfs2_glock_nq will
call inode_dio_wait. When there is pending direct I/O, we'll end up
waiting for iomap_dio_complete, which will call
invalidate_inode_pages2_range, which will try to lock the pages
already locked for gfs2_readahead.

This late in the 5.8 release cycle, I'd like to propose converting
gfs2 back to mpage_readpages. That requires reinstating
mpage_readpages, but is otherwise relatively trivial. We can then
introduce an IOCB_CACHED or equivalent flag, fix the locking order in
gfs2, convert gfs2 to mpage_readahead, and finally remove
mpage_readpages in 5.9.

I'll post a patch queue that does this for comment.

Thanks,
Andreas