Re: [PATCH 17/18] fs: icache remove inode_lock

From: Nick Piggin
Date: Fri Oct 15 2010 - 16:56:54 EST


On Sat, Oct 16, 2010 at 07:50:45AM +1100, Nick Piggin wrote:
> On Fri, Oct 15, 2010 at 09:59:43PM +1100, Dave Chinner wrote:
> > On Fri, Oct 15, 2010 at 05:41:50PM +1100, Nick Piggin wrote:
> > > You're worried about mere mortals reviewing and understanding it...
> > > I don't really know. If you understand inode locking today, you
> > > can understand the inode scaling series quite easily. Ditto for
> > > dcache. rcu-walk path walking is trickier, but it is described in
> > > detail in documentation and changelog.
> > >
> > > And you can understand the high level approach without exactly
> > > digesting every detail at once. The inode locking work goes to
> > > break up all global locks:
> > >
> > > - a single inode object is protected (to the same level as
> > > inode_lock) with i_lock. This makes it really trivial for
> > > filesystems to lock down the object without taking a global
> > > lock.
> >
> > Which is unnecessarily wide, and results in i_lock having to have
> > list locks nested inside it, and that leads to the lock
> > inversion try-lock mess that several people have complained about.
>
> Gee, you keep repeating this so often that you have me doubting
> myself a tiny bit, so I have to check.
>
> $ grep spin_trylock fs/inode.c fs/fs-writeback.c
> fs/inode.c: if (!spin_trylock(&inode->i_lock)) {
> fs/inode.c: if (!spin_trylock(&old->i_lock)) {
> fs/inode.c: if (!spin_trylock(&old->i_lock)) {
> fs/fs-writeback.c: if (!spin_trylock(&inode->i_lock)) {
> fs/fs-writeback.c: if (!spin_trylock(&inode->i_lock)) {
> fs/fs-writeback.c: if (!spin_trylock(&inode->i_lock)) {
>
> This is your unmaintainable mess? You decided to rewrite your own

The other thing that strikes my mind is that maybe you're looking
at intermediate steps in the series, which admittedly grow ugly
before they start to get clean.

This was a deliberate choice in how I set out the patch series, to
show very clearly every small step where and how locking was being
changed along the way. It is made so it can be followed (from the
patch series) by fs developers or anyone porting code to it almost.
It is absolutely not supposed to look pretty or be run with only
the first steps done. It is supposed to be verifiable, auditable,
and bisectable.

I've split it up like this since I first posted anything to any
list and asked for comments. If you object to that now, I find it
a bit rich. I will change the way it is done if Al tells me to,
but ugly in the intermediate steps don't mean a thing if you're
ignoring the end result.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/