Re: [PATCH 1/12] radix_tree: exceptional entries and indices

From: Hugh Dickins
Date: Tue Jul 12 2011 - 18:56:38 EST


On Sat, 18 Jun 2011, Andrew Morton wrote:
> On Fri, 17 Jun 2011 17:13:38 -0700 (PDT) Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> > On Fri, 17 Jun 2011, Andrew Morton wrote:
> > > On Tue, 14 Jun 2011 03:42:27 -0700 (PDT)
> > > Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> > >
> > > > The low bit of a radix_tree entry is already used to denote an indirect
> > > > pointer, for internal use, and the unlikely radix_tree_deref_retry() case.
> > > > Define the next bit as denoting an exceptional entry, and supply inline
> > > > functions radix_tree_exception() to return non-0 in either unlikely case,
> > > > and radix_tree_exceptional_entry() to return non-0 in the second case.
> > >
> > > Yes, the RADIX_TREE_INDIRECT_PTR hack is internal-use-only, and doesn't
> > > operate on (and hence doesn't corrupt) client-provided items.
> > >
> > > This patch uses bit 1 and uses it against client items, so for
> > > practical purpoese it can only be used when the client is storing
> > > addresses. And it needs new APIs to access that flag.
> > >
> > > All a bit ugly. Why not just add another tag for this? Or reuse an
> > > existing tag if the current tags aren't all used for these types of
> > > pages?
> >
> > I couldn't see how to use tags without losing the "lockless" lookups:
>
> So lockless pagecache broke the radix-tree tag-versus-item coherency as
> well as the address_space nrpages-vs-radix-tree coherency.

I don't think that remark is fair to lockless pagecache at all. If we
want the scalability advantage of lockless lookup, yes, we don't have
strict coherency with tagging at that time. But those places that need
to worry about that coherency, can lock to do so.

> Isn't it fun learning these things.
>
> > because the tag is a separate bit from the entry itself, unless you're
> > under tree_lock, there would be races when changing from page pointer
> > to swap entry or back, when slot was updated but tag not or vice versa.
>
> So... take tree_lock?

I wouldn't call that an improvement...

> What effect does that have?

... but admit I have not measured: I rather assume that if we now change
tmpfs from lockless to locked lookup, someone else will soon come up with
the regression numbers.

> It'd better be
> "really bad", because this patchset does nothing at all to improve core
> MM maintainability :(

I was aiming to improve shmem.c maintainability; and you have good grounds
to accuse me of hurting shmem.c maintainability when I highmem-ized the
swap vector nine years ago.

I was not aiming to improve core MM maintainability, nor to harm it.
I am extending the use to which the radix-tree can be put, but is that
so bad?

Hugh
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/