Re: [git pull] vfs and fs fixes

From: Steven Whitehouse
Date: Wed Apr 25 2012 - 12:47:51 EST


Hi,

On Wed, 2012-04-25 at 18:26 +0200, Jan Kara wrote:
> On Wed 25-04-12 07:29:30, J. Bruce Fields wrote:
> > On Wed, Apr 25, 2012 at 12:23:12AM +0200, Jan Kara wrote:
> > > On Tue 24-04-12 15:52:36, J. Bruce Fields wrote:
> > > > On Fri, Apr 20, 2012 at 01:15:17PM +0200, Jan Kara wrote:
> > > > > On Wed 18-04-12 00:44:24, Al Viro wrote:
> > > > > > On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > > > > > > Or I could increment that counter for all the conflicting operations and
> > > > > > > > rely on it instead of the i_mutex. I was trying to avoid adding
> > > > > > > > something like that (an inc, a dec, another error path) to every
> > > > > > > > operation. And hoping to avoid adding another field to struct inode.
> > > > > > > > Oh well.
> > > > > > >
> > > > > > > We could just say that we can do a double inode lock, but then
> > > > > > > standardize on the order. And the only sane order is comparing inode
> > > > > > > pointers, not inode numbers like ext4 apparently does.
> > > > > > >
> > > > > > > With a standard order, I don't think it would be at all wrong to just
> > > > > > > take the inode lock on rename.
> > > > > >
> > > > > > In principle, yes, but have you tried to grep for i_mutex? Note that
> > > > > > we have *another* place where multiple ->i_mutex might be held on
> > > > > > non-directories (and unless I'm missing something, ext4 move_extent.c
> > > > > > stuff doesn't play well with it): quota writes. Which can, AFAICS,
> > > > > > happen while write(2) is holding ->i_mutex on a regular file. So
> > > > > > it's not _that_ easy - we want something like "and quota file goes
> > > > > > last", since there we don't get to change the locking order - the first
> > > > > > ->i_mutex is taken too far outside.
> > > > > Hum, I think I could just do away with quota file i_mutex being special.
> > > > > It's used for two purposes:
> > > > > 1) When quota is being turned on/off, we want to set/clear inode immutable
> > > > > flag, truncate page cache, etc. But we should be able to push this locking
> > > > > outside of quota locks.
> > > > > 2) Inside filesystems when quota file is written to. Quota writes are
> > > > > serialized by quota code anyway and no one else has any business with quota
> > > > > files (they are marked as immutable to avoid mistakes) so there i_mutex is
> > > > > not really needed.
> > > >
> > > > Grepping for I_MUTEX_QUOTA shows hits in ext4, reiserfs, and gfs2. The
> > > > former two are in code called from the quota code (through the
> > > > ->quota_write method). But the gfs2 code appears to be called directly
> > > > from gfs2's write code.
> > > Ah, gfs2 doesn't use generic quota code so whatever it does is its own
> > > invention. For ext4 and reiserfs I could get rid of I_MUTEX_QUOTA as I
> > > wrote.
> >
> > So, just the appended?
> Yup, that's the easier part. We also use the mutex in quota code itself
> (fs/quota/dquot.c). That's somewhat harder to solve but still relatively
> simple.
>
> > But unfortunately as long as that's left in gfs2 we're still stuck
> > trying to order quota files after other files when we take two
> > non-directory i_mutexes elsewhere.
> As far as GFS2 is concerned, I'm not sure what it uses i_mutex in quota
> code for. In any case it should be possible to replace that usage by some
> GFS2 internal lock to get rid of the last usage of I_MUTEX_QUOTA... Steven?
>
> Honza
>
I'll have a look and see what we can do. I'm not sure off the top of my
head how easy it will be to eliminate this lock.

Steve.
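
For anyone following the thread: the "standard order" suggested in the
quoted text (compare inode pointers and take the lower address first) can
be sketched in userspace C. This is only an illustration under stated
assumptions, not kernel code. struct fake_inode, lock_first(), lock_two()
and unlock_two() are hypothetical names, and a pthread mutex stands in for
->i_mutex.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical stand-in for struct inode; only the lock matters here. */
struct fake_inode {
	pthread_mutex_t lock;	/* plays the role of ->i_mutex */
	unsigned long ino;	/* inode number, deliberately NOT used for ordering */
};

/* Pick the object that must be locked first: the one at the lower address. */
static struct fake_inode *lock_first(struct fake_inode *a, struct fake_inode *b)
{
	return ((uintptr_t)a <= (uintptr_t)b) ? a : b;
}

/*
 * Lock both objects in address order. Every caller that goes through this
 * helper takes the two locks in the same order, so two threads locking the
 * same pair cannot deadlock against each other (no ABBA inversion).
 */
static void lock_two(struct fake_inode *a, struct fake_inode *b)
{
	struct fake_inode *first = lock_first(a, b);
	struct fake_inode *second = (first == a) ? b : a;

	pthread_mutex_lock(&first->lock);
	if (second != first)	/* same object passed twice: lock it only once */
		pthread_mutex_lock(&second->lock);
}

/* Release both locks; unlock order does not affect correctness. */
static void unlock_two(struct fake_inode *a, struct fake_inode *b)
{
	pthread_mutex_unlock(&a->lock);
	if (a != b)
		pthread_mutex_unlock(&b->lock);
}
```

Note that this only covers callers that take both locks through the same
helper. It does not address the case raised in the thread, where the first
->i_mutex is already held far outside (write(2) on a regular file) before
the quota file's lock is taken, which is why a quota file would still have
to "go last" as long as it uses i_mutex.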

