[TIMINGS] Re: 2.1.xxx makes Electric Fence 22x slower

Bruno Haible (haible@ilog.fr)
Sat, 5 Sep 1998 23:48:21 +0200 (MET DST)


Here are timings of the three alternatives, for the "common" case.
At the end, you find also some discussion of the many-vmas case.

I did the following experiment: Compiled each of

* plain 2.1.119
* 2.1.119 with AVL trees,
* 2.1.119 with fuzzy hashing patch from Andrey.

For the sake of profiling, I made `find_vma' notinline, and the auxiliary
functions inline, so they don't show up in the histogram. For each of
them, I took the profiling over the following actions:

- let it boot,
- log in and start X with fvwm and an xterm,
- start apache,
- start netscape v3 and browse four directories and one 24k html file,
- unpack, configure and build a GNU package (gperf), this calls g++,
- finally "readprofile -v -m /System.map".

Here are the figures.

2.1.119 2.1.119 2.1.119
plain avl fuzzyhash

vm_enough_memory 4 4 6
vma_insert_hash - - 6
vma_remove - - 3
sys_brk 6 6 8
do_mmap 33 25 19
get_unmapped_area 6 2 3
find_vma 27 44 67
unmap_fixup 4 1 6
do_munmap 18 20 17
build_mmap_avl - 13 -
sys_munmap 1 0 1
exit_mmap 14 5 13
insert_vm_struct 10 23 8
merge_segments 17 36 28

total of mmap.c 140 179 185

total of kernel 6783 6806 6661

Result:

* AVL and fuzzyhash have equal timings, within the standard deviation.(*)

* Both are about 30% slower than plain 2.1.119. The mmap.c code percentage
in the total system time thus increases from 2.1% to 2.6%.

Now for the many-vmas case: In a task with n VMAs, a single VMA operation
(lookup, insert, delete) costs

* "Plain" on average: n/2 VMA accesses.
worst case: n VMA accesses.

* "Fuzzyhash" effectively uses 16 ordered lists instead of 1 ordered list,
hence on average: n/32 VMA accesses.
worst case: n VMA accesses.
The worst case will happen when a program allocates only 64KB aligned
segments.

* "AVL" on average: log(n)/log(2) accesses,
worst case: log(n)/log(2) accesses for lookup,
2*log(n)/log(2) for insert/delete.

So, AVL scales better for large mmap uses. To make "fuzzyhash" scale
better, you would have to increase the number of lists - a traditional
space-time tradeoff. And then AVL still behaves much better in some cases -
say, 2000 64KB-aligned 64KB-sized VMAs.

Bruno

(*) A math book tells me that if the sum of N samples comes out to be N*p,
then the standard deviation is sqrt(N*p*(1-p)). In our case, N = 6800,
p = 0.0206, hence the standard deviation is about 13.5.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/faq.html