Re: [RFC] B+Tree library

From: Johannes Berg
Date: Fri Oct 31 2008 - 07:33:02 EST


On Fri, 2008-10-31 at 12:26 +0100, JÃrn Engel wrote:
> On Fri, 31 October 2008 11:35:14 +0100, Johannes Berg wrote:
> >
> > [looks like this hitting LWN means everyone suddenly found it...]
>
> It did?

Yes, on the weekly kernel page, under "Patches and updates / Core kernel
code"

> > > The one aspect where my implementation is actually nice is in allowing
> > > variable key length. Btree keys are interpreted as an array of unsigned
> > > long. So by passing the correct geometry to the core functions, it is
> > > possible to handle 32bit, 64bit or 128bit btrees, which logfs uses. If
> > > so desired, any other weird data format can be used as well (Zach, are
> > > you reading this?).
> >
> > Would there be an easy way to use 48-bit keys? Or variable length keys?
>
> Variable as in one implementation for several trees with different
> sizes, yes. Variable as in one tree with differently sized keys, no.

Ok, I guess that would blow up the key size to 6+1+32 bytes, or 10 (5)
longs. Bit large.

> > > So would something like this be merged once some users are identified?
> > > Would it be useful for anything but logfs? Or am I plain nuts?
> >
> > I could imagine using it instead of the hash-table for stations and APs
> > in the wireless code, stations are identified by the MAC address (48
> > bit) and APs (BSSs) are identified by their BSSID+SSID (or mesh ID), so
> > variable length. Currently we use a hash table with 256 slots which is
> > quite large for the typical case of mostly less than a hundred entries.
>
> MAC address wouldn't be a problem. For BSSID+SSID one could just keep
> the hashing and use the btree as an 'implementation detail' of the
> 'hashtable'. Beware that I don't allow two identical keys. The
> implications of that make my toenails curl up.

Heh. We don't actually hash on the SSID right now, only on the BSSID,
and then use the SSID to distinguish (it isn't common to have two SSIDs
on the same BSSID)

> > > This implementation is extremely simple. It splits nodes when they
> > > overflow. It does not move elements to neighboring nodes. It does not
> > > try fancy 2:3 splits. It does not even merge nodes when they shrink,
> > > making degenerate cases possible. And it requires callers to do
> > > tree-global locking. In effect, it will be hard to find anything less
> > > sophisticated.
> >
> > I think the wireless case would probably want to have real shrinking
> > because it's well possible that you're moving, with your laptop, from an
> > area with a large number of APs to say your home out in the countryside
> > that only has your single AP.
>
> Indeed.

I guess not then for now, unless one of us wants to implement
resizing... I'll look for replacements another time, nobody has
complained yet about the huge hash table :)

johannes

Attachment: signature.asc
Description: This is a digitally signed message part