Re: [Cluster-devel] [PATCH v11 16/25] fs: Convert mpage_readpages to mpage_readahead

From: Matthew Wilcox
Date: Tue Jun 16 2020 - 20:32:30 EST


On Wed, Jun 17, 2020 at 12:36:13AM +0200, Andreas Gruenbacher wrote:
> Am Mi., 15. Apr. 2020 um 23:39 Uhr schrieb Matthew Wilcox <willy@xxxxxxxxxxxxx>:
> > From: "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx>
> >
> > Implement the new readahead aop and convert all callers (block_dev,
> > exfat, ext2, fat, gfs2, hpfs, isofs, jfs, nilfs2, ocfs2, omfs, qnx6,
> > reiserfs & udf). The callers are all trivial except for GFS2 & OCFS2.
>
> This patch leads to an ABBA deadlock in xfstest generic/095 on gfs2.
>
> Our lock hierarchy is such that the inode cluster lock ("inode glock")
> for an inode needs to be taken before any page locks in that inode's
> address space.

How does that work for ...

writepage: yes, unlocks (see below)
readpage: yes, unlocks
invalidatepage: yes
releasepage: yes
freepage: yes
isolate_page: yes
migratepage: yes (both)
putback_page: yes
launder_page: yes
is_partially_uptodate: yes
error_remove_page: yes

Is there a reason that you don't take the glock in the higher level
ops which are called before readhead gets called? I'm looking at XFS,
and it takes the xfs_ilock SHARED in xfs_file_buffered_aio_read()
(called from xfs_file_read_iter).

Not that after -rc1 is a great time to be upending the locking model in
a filesystem ... but then, this has been baking in -mm for ten weeks and
the GFS2 mailing list has been on the cc for the patches for five months,
so I don't have a lot of sympathy for this.