Re: [PATCH 3/3] fs: rcu protect inode hash lookups

From: Paul E. McKenney
Date: Tue Nov 02 2010 - 08:11:27 EST


On Tue, Nov 02, 2010 at 11:01:52AM +1100, Dave Chinner wrote:
> On Mon, Nov 01, 2010 at 04:29:22PM +0100, Eric Dumazet wrote:
> > Le mardi 02 novembre 2010 Ã 00:44 +1100, Dave Chinner a Ãcrit :
> >
> > > Perhaps you should rename that file "slab_destroy_by_rcu-tips.txt",
> > > because the current name seems unrelated to the contents. :/
> > >
> >
> > Hmm, I dont know, this doc really is about the nulls thing.
>
> Ok, now I understand - there's a new list type that I didn't know
> about called hlist_nulls. not surprising - it's not documented
> anywhere.
>
> Maybe explicitly describing what the list_null pattern Ñs or a
> pointer to linux/include/list_nulls.h might be appropriate, because
> I managed to read that documentation and not realise that it was
> refering to a specific type of list that was already implemented
> rather than a simple marker technique.
>
> > This stuff also addressed one problem I forgot to tell you about: During
> > a lookup, you might find an item that is moved to another chain by
> > another cpu, so your lookup is redirected to another chain. You can miss
> > your target.
>
> <groan>
>
> So, to go to per-chain locks as per the proposed bit-lock-on-the-
> low-bit-of-the-head-pointer infrastructure, we'll have to cross that
> with the hlist_null code that plays low-bit pointer games for
> detecting the end of the chain.
>
> That's just messy - another hash chain specific scalability hackup.
> My dislike of using hash tables for unbounded caches is not
> improved by this....

Indeed, using call_rcu() or synchronize_rcu() guarantees that the identity
of a given RCU-protected structure will not change while you remain in a
given RCU read-side critical section. In contrast, SLAB_DESTROY_BY_RCU
only guarantees that the type of the object will remain the same. So this
is the usual complexity/speed tradeoff. Use of call_rcu() gives the freed
memory more time to grow cache-cold, and synchronize_rcu() further blocks
the caller for several milliseconds, but allows much simpler identity
checks, often permitting you to dispense entirely with identity checks.

In contrast, SLAB_DESTROY_BY_RCU avoid cache-coldness, but requires
much more careful identity checks.

> > You must find a way to detect such thing to restart the lookup at
> > the beginning (of the right chain). Either you stick the chain
> > number in a new inode field (that costs extra memory), or you use
> > the 'nulls' value that can let you know if you ended your lookup
> > in the right chain.
>
> The chain is determined by hashing the inode number. Perhaps a
> simple enough test is to hash the last inode number on a cache miss
> and if that doesn't match the hash of the lookup key we redo the
> search. That seems to me like it will avoid needing to play games
> with termination markers - does that sound reasonable?

As long as you do the test under a lock that prevents the inode's identity
from changing. Not sure what you should do about hash collisions, though.
Perhaps recheck the incoming pointer?

Thanx, Paul
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/