i was using the random tables to mix up the bits in each byte, like so:
unsigned char rand1[256] = { ... };
unsigned char rand2[256] = { ... };
#define FIRSTBYTE(x) (0xff & (x))
#define SECNDBYTE(x) (0xff & ((x)>>8))
#define THIRDBYTE(x) (0xff & ((x)>>16))
#define hashfn(key) ( (rand1[FIRSTBYTE(key)] << 4) ^ \
(rand2[SECNDBYTE(key)] << 2) ^ \
(rand1[THIRDBYTE(key)]) )
to generate a 12-bit hash table index. this way i don't have to mask off
extra bits as the final step, and i can use daintily sized random tables.
> In Linux x86, the size of the hash table is 2048 (1 << PAGE_HASH_BITS),
> so your hash function, as implemented, would be very bad.
the page cache hash function, btw, already works very well, and i don't
believe replacing it with any other kind of function will help. the only
hash function we should be concerned with improving is the buffer cache
hash function: see fs/buffer.c, ll. 425-30.
i re-read Knuth over dinner last night. unfortunately, he is discussing
hashes where you store data elements right in the hash table, and can then
easily do linear probing [i heard that, you in the second row]. i can't
think of a good way to apply his double hashing algorithm (algorithm D, p.
528) or Brent's variation (p. 532) to help improve the hash function
without using a multiplicative hash.
- Chuck Lever
-- corporate: <chuckl@netscape.com> personal: <chucklever@netscape.net> or <cel@monkey.org>The Linux Scalability project: http://www.citi.umich.edu/projects/citi-netscape/
- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/