Re: xfs: commit 6552321831dc "xfs: remove i_iolock and use i_rwsem in the VFS inode instead" change causes hang

From: Christoph Hellwig
Date: Sun Jan 08 2017 - 13:19:15 EST


On Sun, Jan 08, 2017 at 09:59:25AM -0800, James Bottomley wrote:
> Hey, that's not really true: the inode lock (i_rwsem) is used in all
> sorts of generic places, including generic_file_write_iter(). That's,
> I think, why ima is using it to try to prevent writes while it measures
> the file.

But all these are _below_ file_operations. The only place where take
them in the VFS is for namespace locking, e.g. before calling into
inode_operations (to generalize a little).

>
> > So the answer here is that ima needs to stop playing with i_rwsem.
>
> Isn't there a happy medium? most sensible filesystems will allow shared
> reading (unless they want to tank performance) so we can rely on the
> fact that even if a fs does use i_rwsem internally on the read path, it
> will have to be shared.

At least for direct I/O that doesn't always have to be true.

> So simply replacing the inode_lock() in ima
> with inode_lock_shared() should do what ima wants and not interact
> badly even if the underlying FS uses i_rwsem. If there's ever a FS
> that takes it exclusively in the read path, ima can simply blacklist
> it.

IFF we actually allow recursive readers for rw_semaphores this would
work around the issue (but I'm not sure about that fact, at least
in the past we didn't). It won't fix IMA for all the file systems
use other synchronization for reads, e.g. the cluster locks in ocfs2
or gfs2. It won't fix NFS which will exhibit exacly the same issue
as Mimi reported.

Last but not least it won't solve the problem that IMA has never been
designed and does neither document the requires it has from a file system,
nor is there any systematic testing for it. It will keep on breaking
because it has all kinds of weird implicit assumptions never written down
or verified, and the test coverage for it is basically non-existing.