On Tue, Sep 11, 2001 at 10:43:40PM +0200, Manfred Spraul wrote:
> Interesting: you lose all microbenchmarks, your patch doesn't improve
> LIFO ordering, and you still think your patch is better? Could you
> explain why?

I believe it's much cleaner to have proper lists. It also shrinks the
caches in FIFO order (shrinking is FIFO and allocation is LIFO).
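
As a rough illustration of the ordering described above, here is a minimal user-space sketch of a cache with three slab lists. It is a toy model, not the code from the patch: the names (struct slab, struct cache, cache_alloc, cache_free, cache_shrink, cache_add_slab) and the policy of adding at the head and shrinking from the tail are assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical toy model; none of these names come from the patch. */
struct slab {
	struct slab *prev, *next;	/* links within one of the three lists */
	int inuse;			/* objects currently handed out */
};

struct cache {
	struct slab full, partial, free;	/* sentinel list heads */
	int objs_per_slab;
};

static void list_init(struct slab *head) { head->prev = head->next = head; }
static int list_empty(const struct slab *head) { return head->next == head; }

static void list_del(struct slab *s)
{
	s->prev->next = s->next;
	s->next->prev = s->prev;
}

/* Insert right after the head, so head->next is always the newest entry. */
static void list_add(struct slab *s, struct slab *head)
{
	s->next = head->next;
	s->prev = head;
	head->next->prev = s;
	head->next = s;
}

void cache_init(struct cache *c, int objs_per_slab)
{
	list_init(&c->full);
	list_init(&c->partial);
	list_init(&c->free);
	c->objs_per_slab = objs_per_slab;
}

/* A freshly grown slab starts on the free list. */
void cache_add_slab(struct cache *c, struct slab *s)
{
	s->inuse = 0;
	list_add(s, &c->free);
}

/* Allocation is LIFO: prefer a partial slab, else the newest free slab.
 * Returns the slab the object came from, or NULL if the cache must grow. */
struct slab *cache_alloc(struct cache *c)
{
	struct slab *s;

	if (!list_empty(&c->partial))
		s = c->partial.next;
	else if (!list_empty(&c->free))
		s = c->free.next;
	else
		return NULL;

	list_del(s);
	s->inuse++;
	list_add(s, s->inuse == c->objs_per_slab ? &c->full : &c->partial);
	return s;
}

/* Freeing one object moves its slab to the partial or free list. */
void cache_free(struct cache *c, struct slab *s)
{
	list_del(s);
	s->inuse--;
	list_add(s, s->inuse == 0 ? &c->free : &c->partial);
}

/* Shrinking is FIFO: reclaim the slab that has been free the longest. */
struct slab *cache_shrink(struct cache *c)
{
	struct slab *s;

	if (list_empty(&c->free))
		return NULL;
	s = c->free.prev;
	list_del(s);
	return s;
}
```

In this model the slab most recently touched is the first one reused, while a slab that has sat unused on the free list the longest is the first one given back when shrinking.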

The main reason it was significantly slower in the microbenchmarks is
that it wasn't microoptimized; the design of the three lists wasn't the
real problem. I wrote this microbenchmark:

#include <linux/module.h>
#include <linux/slab.h>
#include <asm/msr.h>

#define NR_MALLOC 1000
#define SIZE 10

int init_module(void)
{
	unsigned long start, stop;
	int i;
	void * malloc[NR_MALLOC];

	rdtscl(start);
	for (i = 0; i < NR_MALLOC; i++)
		malloc[i] = kmalloc(SIZE, GFP_KERNEL);
	rdtscl(stop);

	for (i = 0; i < NR_MALLOC; i++)
		kfree(malloc[i]);

	printk("cycles %lu\n", stop - start);

	return 1;
}

These are the figures on a UP PII 450MHz:

mainline pre10 (slabs-list patch applied):

	cycles 90236
	cycles 89812
	cycles 90106

mainline but with slabs-list backed out:

	cycles 85187
	cycles 85386
	cycles 85169

mainline but with slabs-list backed out and your latest lifo patch
applied, but with debugging turned off (also, I didn't check the
offslab_limit, but it was applied):

	cycles 85578
	cycles 85856
	cycles 85596

mainline with this new attached patch applied:

	cycles 85775
	cycles 85963
	cycles 85901
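
To make the figures above easier to compare, here is a small user-space sketch that averages the three runs of each configuration and expresses the result relative to the backed-out baseline. The helper names (mean_cycles, cycles_per_alloc, overhead_percent) and the array names are made up for this example; only the cycle counts and NR_MALLOC are taken from the post.

```c
#include <stddef.h>

#define NR_MALLOC 1000	/* allocations per run, as in the microbenchmark */

/* Cycle counts copied from the figures above. */
const unsigned long pre10[]      = { 90236, 89812, 90106 };
const unsigned long backed_out[] = { 85187, 85386, 85169 };
const unsigned long lifo_patch[] = { 85578, 85856, 85596 };
const unsigned long new_patch[]  = { 85775, 85963, 85901 };

/* Arithmetic mean of n cycle counts (0.0 for an empty set). */
double mean_cycles(const unsigned long *runs, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += (double)runs[i];
	return n ? sum / (double)n : 0.0;
}

/* Average cost of one kmalloc() call within the timed loop. */
double cycles_per_alloc(const unsigned long *runs, size_t n)
{
	return mean_cycles(runs, n) / NR_MALLOC;
}

/* Overhead of `mean` relative to `baseline`, in percent. */
double overhead_percent(double mean, double baseline)
{
	return (mean - baseline) / baseline * 100.0;
}
```

Taking the backed-out kernel as the baseline, this gives an overhead of about 5.6% for pre10, about 0.5% for the lifo patch and about 0.7% for the new patch, with each kmalloc costing roughly 85 to 90 cycles on average. With only three runs per configuration, the last two are close enough that the difference may be within noise.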

So in short, I'd prefer to merge into mainline the attached patch, which
optimizes the slab lists and takes full advantage of the three lists.
Don't hesitate to complain if you still have problems with those
microoptimizations applied. (BTW, compiled with gcc 3.0.2.)

Andrea