On Sun, Sep 17, 2017 at 09:34:01AM -0700, Linus Torvalds wrote:
> Now, I suspect most (all?) do, but that's a historical artifact rather
> than "design". In particular, the VFS layer used to do the locking for
> the filesystems, to guarantee the POSIX requirements (POSIX requires
> that writes be seen atomically).
>
> But that lock was pushed down into the filesystems, since some
> filesystems really wanted to have parallel writes (particularly for
> direct IO, where that POSIX serialization requirement doesn't exist).
>
> That's all many years ago, though. New filesystems are likely to have
> copied the pattern from old ones, but even then..
>
> Also, it's worth noting that "inode->i_rwlock" isn't even well-defined
> as a lock. You can have the question of *which* inode gets talked
> about when you have things like overlayfs etc. Normally it would be
> obvious, but sometimes you'd use "file->f_mapping->host" (which is the
> same thing in the simple cases), and sometimes it really wouldn't be
> obvious at all..
>
> So... I'm really not at all convinced that i_rwsem is sensible. It's
> one of those things that are "mostly right for the simple cases",

The thing pretty much common to all of them is that write() might need
to modify permissions (suid removal), which brings ->i_rwsem in one
way or another - notify_change() needs that held...
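
To make the suid-removal point concrete, here is a rough userspace sketch
in C. It is a toy model, not kernel code: struct toy_inode,
toy_notify_change(), toy_remove_privs() and toy_write() are hypothetical
names made up for illustration, a pthread rwlock stands in for ->i_rwsem,
and the privileged-writer exception is left out. It only shows why a write
path that may strip setuid/setgid ends up needing the lock that the
attribute-change helper expects to be held.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>   /* mode_t, S_ISUID, S_ISGID, S_IXGRP */

/* Toy stand-in for an inode; NOT the kernel's struct inode. */
struct toy_inode {
    pthread_rwlock_t i_rwsem;  /* models inode->i_rwsem */
    bool write_locked;         /* bookkeeping so the rule can be asserted */
    mode_t i_mode;
    size_t i_size;
};

/* Models the rule that attribute changes need the lock held exclusively. */
static void toy_notify_change(struct toy_inode *inode, mode_t new_mode)
{
    assert(inode->write_locked);
    inode->i_mode = new_mode;
}

/* Models suid removal on write (assumed policy: always drop setuid,
 * drop setgid only when group-execute is also set). */
static void toy_remove_privs(struct toy_inode *inode)
{
    mode_t kill = S_ISUID;

    if ((inode->i_mode & (S_ISGID | S_IXGRP)) == (S_ISGID | S_IXGRP))
        kill |= S_ISGID;
    if (inode->i_mode & kill)
        toy_notify_change(inode, inode->i_mode & ~kill);
}

/* Toy write path: takes the lock exclusively because it may have to
 * change permissions, then "writes" by growing the size. */
static size_t toy_write(struct toy_inode *inode, size_t len)
{
    pthread_rwlock_wrlock(&inode->i_rwsem);
    inode->write_locked = true;

    toy_remove_privs(inode);
    inode->i_size += len;

    inode->write_locked = false;
    pthread_rwlock_unlock(&inode->i_rwsem);
    return len;
}
```

Calling toy_notify_change() without going through toy_write() trips the
assertion, which is the toy equivalent of notify_change() being called
without the lock held.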