Re: [git pull] vfs.git - including i_mutex wrappers
From: Dave Chinner
Date: Sat Jan 23 2016 - 19:53:57 EST
On Sat, Jan 23, 2016 at 11:09:44PM +0000, Al Viro wrote:
> On Sun, Jan 24, 2016 at 09:44:35AM +1100, Dave Chinner wrote:
>
> > FWIW, I'm not opposed to making such a locking change - I'm more
> > concerned about the fact I'm finding out about plans for such a
> > fundamental locking change from a pull request on the last day of a
> > merge window....
>
> Look at the commit message (*and* pull request posting) of an earlier vfs.git
> pull request in the beginning of this window. Or into the thread back in
Which one? You've sent 8 or 9 pull requests so far this merge
window, and I haven't had time to read all of them closely. Indeed,
you wrote one sentence in a long pull request description that
mentions converting the inode mutex to a rwsem. Not surprising
that I missed it - I'm sure lots of other people did too....
> May when it had been first proposed (and pretty much the same patch had been
> generated and posted by Linus). Changes needed for parallel ->lookup() had
> been discussed; it was a side branch of one of the RCU symlink threads and
> ISTR your own postings in it.
I don't have the memory of an elephant, and the commit message or
pull requests make no mention of that discussion, nor do I recall
there being any clear resolution in that discussion way back then.
Certainly there's nothing in the pull req or the commit message
that would make me think "oh, that's what we talked about a year
ago!".
> Basic scheme: have dentries under ->lookup() marked as such and inserted into
> hash (still negative, obviously) before calling ->lookup(). The method itself
> is called with ->i_mutex replacement taken shared; anyone running into such
> dentry in dcache lookup will wait (on parent directory ->i_mutex queue,
> explicitly kicked once ->lookup() is done) and repeat dcache lookup. In
> case when the current code would've silently freed ->lookup() argument (error
> or "I've used an existing dentry") the thing will be unhashed and dropped,
> without ever losing the "it's under lookup" flag. Primitives like
> d_splice_alias() would remove the flag in question.
>
> Anyone running into such sucker in RCU mode should treat it as "dcache miss,
> need to fall back to non-lazy mode". Flag (as all dentry flags) protected
> by ->d_lock.
>
> If a filesystem simply wants to preserve the existing exclusion, it should
> add a private per-inode mutex and take it in its ->lookup() instance; all
> other methods will still get exclusion on ->i_mutex replacement.
>
> There will be interesting prereqs, but for XFS it's a non-issue. Now,
> something like ceph or lustre... <shudder> Again, for XFS (for any
> normal Unix filesystems, really) no extra exclusion should be needed.
>
> readdir() is another potential target for weaker exclusion (i.e. switching
> it to taking that thing shared), but that's a separate story and I'd prefer
> to deal with ->lookup() first. There are potentially hairy issues around
> the instances that pre-seed dcache and I don't want to mix them into the
> initial series.
So you're doing this purely to enable lookup concurrency, not
for anyone else to be able to use the inode lock as a read/write
lock? Can anyone use the inode rwsem as a read/write lock for their
own purposes? If so, we can probably use it to replace the XFS
IOLOCK and so effectively remove a layer of locking in various
XFS IO paths. What's the policy you are proposing here?
Cheers,
Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx