Re: [rfc][patch] dynamic resizing dentry hash using RCU

From: Nick Piggin
Date: Fri Feb 23 2007 - 20:08:48 EST


On Fri, Feb 23, 2007 at 05:31:17PM +0100, Eric Dumazet wrote:
> On Friday 23 February 2007 16:37, Nick Piggin wrote:
> > The dentry hash uses up 8MB for 1 million entries on my 4GB system is one
> > of the biggest wasters of memory for me. Because I rarely have more than
> > one or two hundred thousand dentries. And that's with several kernel trees
> > worth of entries. Most desktop and probably even many types of servers will
> > only use a fraction of that.
> >
> > So I introduce a new method for resizing hash tables with RCU, and apply
> > that to the dentry hash.
> >
> > The primitive heuristic is that the hash size is doubled when the number of
> > entries reaches 150% the hash size, and halved when the number is 50%.
> > It should also be able to shrink under memory pressure, and scale up as
> > large as we go.
> >
> > A pity it uses vmalloc memory for the moment.
> >
> > The implementation is not highly stress tested, but it is running now. It
> > could do a bit more RCU stuff asynchronously rather than with
> > synchronize_rcu, but who cares, for now.
> >
> > The hash is costing me about 256K now, which is a 32x reduction in memory.
> >
> > I don't know if it's worthwhile to do this, rather than move things to
> > other data structures, but something just tempted me to have a go! I'd be
> > interested to hear comments, and how many holes people can spot in my
> > design ;)
> >
> > Thanks,
> > Nick
>
> Hi Nick
>
> Thats a really good idea !
>
> The vmalloc() thing could be a problem, so :
>
> Could you bring back the support of 'dhash_entries=262144' boot param, so that
> an admin could set the initial size of dhash table, (and not shrink it under
> this size even if the number of dentries is low)

Hi Eric,

Yeah, that's a good idea. I'll look at doing that.

> In case dhash_entries is set in boot params, we could try to use
> alloc_large_system_hash() for the initial table, (eventually using Hugepages
> (not vmalloc)), if we add a free_large_system_hash() function to be able to
> free the initial table.
>
> Or else, time is to add the possibility for vmalloc() to use hugepages
> itself...

That sounds like a nice idea to have a hugepage vmalloc for very large
allocations. The big NUMA guys already use vmalloc to allocate large hashes,
so hugepages would probably be a big win for them.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/