Re: [PATCH,RFC] faster kmalloc lookup

From: Manfred Spraul (manfred@colorfullife.com)
Date: Sun Oct 27 2002 - 08:29:19 EST


I've run my slab microbenchmark over the 3 versions:
- current
- generic_fls
- i386 asm optimized fls

The test reports the fastest time for 100 kmalloc calls in a tight loop
(Duron 700). Loop/test overhead subtracted.

32-byte alloc:
current: 41 ticks
generic_fls: 56 ticks
bsrl: 54 ticks

4096-byte alloc:
current: 84 ticks
generic_fls: 53 ticks
bsrl: 54 ticks

That is a ~40 tick difference for -current between 4096 and 32 bytes, or
~4 cycles for each loop iteration.
The bit scan is roughly 10-15 ticks slower for 32-byte allocs and 30 ticks
faster for 4096-byte allocs.
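
To illustrate the trade-off, here is a minimal userspace sketch of the two
lookup styles. It is not the patch itself: the size table is a simplified,
hypothetical one (powers of two from 32 to 4096), the names index_by_loop
and index_by_bitscan are made up for this example, and __builtin_clz is a
GCC/Clang builtin standing in for the kernel's fls.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical, simplified size table: powers of two from 32 to 4096.
 * The real kernel table is not shown in this thread. */
static const unsigned int sizes[] = { 32, 64, 128, 256, 512, 1024, 2048, 4096 };
#define NR_SIZES (sizeof(sizes) / sizeof(sizes[0]))

/* Linear scan: the cost grows with the number of entries walked,
 * so large sizes pay for every smaller size class before them. */
static int index_by_loop(unsigned int size)
{
	size_t i;

	for (i = 0; i < NR_SIZES; i++)
		if (size <= sizes[i])
			return (int)i;
	return -1;	/* too large for this table */
}

/* Bit scan: the cost does not depend on the size class. */
static int index_by_bitscan(unsigned int size)
{
	unsigned int bits = sizeof(unsigned int) * CHAR_BIT;
	unsigned int top;

	if (size <= 32)
		return 0;
	if (size > 4096)
		return -1;	/* too large for this table */
	/* 1-based position of the highest set bit of size - 1 */
	top = bits - (unsigned int)__builtin_clz(size - 1);
	return (int)top - 5;	/* 32 == 1 << 5 maps to index 0 */
}
```

With this table a 32-byte request leaves the loop after one compare, while
a 4096-byte request walks all eight entries; the bit scan version does the
same work for both.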

No difference between generic_fls and bsrl - the branch predictor can
easily predict all branches in generic_fls for constant kmalloc calls.
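
For reference, a branch-based "find last set" in the style of generic_fls
can be sketched as below. This is a userspace approximation written for
illustration, assuming a 32-bit unsigned int; the name fls_sketch is
hypothetical and the code is not copied from the kernel tree. For a
constant argument every branch goes the same way on each call, which is
why the predictor handles it well.

```c
/* Returns the 1-based position of the highest set bit, or 0 if x == 0.
 * Assumes unsigned int is 32 bits wide. */
static int fls_sketch(unsigned int x)
{
	int r = 32;

	if (!x)
		return 0;
	/* Binary search: if the upper half is empty, shift it away. */
	if (!(x & 0xffff0000u)) {
		x <<= 16;
		r -= 16;
	}
	if (!(x & 0xff000000u)) {
		x <<= 8;
		r -= 8;
	}
	if (!(x & 0xf0000000u)) {
		x <<= 4;
		r -= 4;
	}
	if (!(x & 0xc0000000u)) {
		x <<= 2;
		r -= 2;
	}
	if (!(x & 0x80000000u)) {
		x <<= 1;
		r -= 1;
	}
	return r;
}
```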

--
    Manfred
