I did the following experiment: Compiled each of
* plain 2.1.119
* 2.1.119 with AVL trees,
* 2.1.119 with fuzzy hashing patch from Andrey.
For the sake of profiling, I made `find_vma' notinline, and the auxiliary
functions inline, so they don't show up in the histogram. For each of
them, I took the profiling over the following actions:
- let it boot,
- log in and start X with fvwm and an xterm,
- start apache,
- start netscape v3 and browse four directories and one 24k html file,
- unpack, configure and build a GNU package (gperf), this calls g++,
- finally "readprofile -v -m /System.map".
Here are the figures.
2.1.119 2.1.119 2.1.119
plain avl fuzzyhash
vm_enough_memory 4 4 6
vma_insert_hash - - 6
vma_remove - - 3
sys_brk 6 6 8
do_mmap 33 25 19
get_unmapped_area 6 2 3
find_vma 27 44 67
unmap_fixup 4 1 6
do_munmap 18 20 17
build_mmap_avl - 13 -
sys_munmap 1 0 1
exit_mmap 14 5 13
insert_vm_struct 10 23 8
merge_segments 17 36 28
total of mmap.c 140 179 185
total of kernel 6783 6806 6661
Result:
* AVL and fuzzyhash have equal timings, within the standard deviation.(*)
* Both are about 30% slower than plain 2.1.119. The mmap.c code percentage
in the total system time thus increases from 2.1% to 2.6%.
Now for the many-vmas case: In a task with n VMAs, a single VMA operation
(lookup, insert, delete) costs
* "Plain" on average: n/2 VMA accesses.
worst case: n VMA accesses.
* "Fuzzyhash" effectively uses 16 ordered lists instead of 1 ordered list,
hence on average: n/32 VMA accesses.
worst case: n VMA accesses.
The worst case will happen when a program allocates only 64KB aligned
segments.
* "AVL" on average: log(n)/log(2) accesses,
worst case: log(n)/log(2) accesses for lookup,
2*log(n)/log(2) for insert/delete.
So, AVL scales better for large mmap uses. To make "fuzzyhash" scale
better, you would have to increase the number of lists - a traditional
space-time tradeoff. And then AVL still behaves much better in some cases -
say, 2000 64KB-aligned 64KB-sized VMAs.
Bruno
(*) A math book tells me that if the sum of N samples comes out to be N*p,
then the standard deviation is sqrt(N*p*(1-p)). In our case, N = 6800,
p = 0.0206, hence the standard deviation is about 13.5.
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/faq.html