Re: Hash functions (was Re: 2.2.6_andrea2.bz2)

Chuck Lever (cel@monkey.org)
Thu, 6 May 1999 11:12:28 -0400 (EDT)


On 6 May 1999, Harvey J. Stein wrote:
> > I believe the hash function used by Chuck Lever uses the latter approach
> > (with a multiplication by a prime number). This also saves you from
> > the burden of having to find the next prime number. Also a multiplication
> > by a big prime constant is usually faster than finding a (non-constant)
> > modulus.
>
> That can't be right. An even times a prime is still even, so you'll
> still miss alternate buckets if your table size is a power of 2. More
> precisely, if a hash_fcn mod 2^m suffers from poor distribution, then
> hash_fcn * prime mod 2^m will have the same poor distribution - 2
> objects that went into the same bucket under hash_fcn will still be in
> the same bucket under hash_fcn * prime because primes are invertible
> mod 2^m.

that's one reason why you don't use the LSB of the multiplication result.
instead, it is right-shifted so you get the middle bits, which are much
more likely to be "random."

please, take a look at

http://www.citi.umich.edu/projects/linux-scalability/reports/hash.html

- Chuck Lever

--
corporate:	<chuckl@netscape.com>
personal:	<chucklever@netscape.net> or <cel@monkey.org>

The Linux Scalability project: http://www.citi.umich.edu/projects/linux-scalability/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/