Re: Inode Lock Scalability V7 (was V6)

From: Nick Piggin
Date: Fri Oct 22 2010 - 00:47:18 EST


On Fri, Oct 22, 2010 at 04:07:28AM +0100, Al Viro wrote:
> On Fri, Oct 22, 2010 at 01:34:44PM +1100, Nick Piggin wrote:
>
> > > * walkers of the sb, wb and hash lists can grab ->i_lock at will;
> > > it nests inside their locks.
> >
> > What about if it is going on or off multiple data structures while
> > the inode is live, like inode_lock can protect today. Such as putting
> > it on the hash and sb list.
>
> Look at the code. You are overengineering it. We do *not* need a framework
> for messing with these lists in arbitrary ways. Where would we need to
> do that to an inode we don't hold a reference to or had placed I_FREEING

Look, my point is that I believe it is an easier step to get from
inode_lock to i_lock, and then from there we can go wild.

What is your criteria for a particular lock ordering being "natural"
versus not? In almost all cases we have

[stuff with data structure] -> [stuff with inode]
and
[stuff with inode] -> [stuff with data structure]

So neither is inherently more natural, I think. So it comes down to
how the code fits together and works.

The difficulty with inode_lock breaking is not the data structures.
We know how to lock and modify them. The hardest part is verifying
that a particular inode has no new, unhandled concurrency introduced
to it (eg. the particular concurrency you pointed out in Dave's patch
just now). Agree?

So in that case, I think it is much more natural to be able to take
an inode and with i_lock, cover it from all icache state concurrency.
I object to it being called over engineering -- it's actually just a
big hammer on the inode, compared with fiddling with more complex
rules.


> on and would need i_lock held by caller? Even assuming that we need to
> keep [present in hash, present on sb list] in sync (which I seriously doubt),
> we can bloody well grab both locks before i_lock.

I'm not saying there is. Most of the problems would be between a
particular inode state versus its membership on one of the lists.
However, with my patches, I *don't care* if there is an issue there
or not. It simply doesn't matter because it has the same protection
as inode_lock at that point.

If you want to micro optimise it, change lock orders around, and
open more concurrency, that is easily possible to do after my patches
lift inode_lock. If you do all the changes *before* inode_lock removal,
then it's not bisectable at all.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/