>I tested 2.0.36, 2.2.5 & 2.2.5arca12 with the program appended to the
Arca12 has been my first and only release with rbtrees in the page cache.
Unfortunately it also has a silly bug that can lead to a panic (don't
worry, no mm corruption though ;).
>end of this message. The program builds a large gdbm file. It's a
I am very happy to have a testcase!! I'll try it ASAP.
> user sys elapsed worst 2nd user sys elapsed worst 2nd comments
>2.0.36 27.69 7.70 2:37.27 9.73 6.88 28.97 10.21 17:30.68 76.37 75.11 poor
> 28.32 7.33 0:43.87 3.36 2.32
> 27.52 7.40 0:43.08 3.42 2.22
>
>2.2.5 28.13 6.80 2:50.52 8.00 7.19 29.44 8.70 13:21.34 61.29 64.95 fine
> 28.64 6.61 1:47.39 7.91 6.77
> 28.55 6.40 1:48.49 7.88 7.11
>
>2.2.5arca12 27.99 6.77 1:24.68 14.93 9.38 30.13 6.91 4:13.60 14.08 13.09 very poor
^^^^^^^ ;)) ^^^^^^^^^^^^^^ ;)) ^^^^^^:((
> 28.98 6.69 1:19.67 14.12 9.56
> 28.18 6.77 1:18.99 13.89 9.39
Looking at the bench, rbtrees in the page cache are not obviously a loss.
Well, I don't say that they are a win, but they are surely not harming
performance too much ;)).
The bad interactive feel you noticed has nothing to do with rb-trees,
because they are only in the page cache (the buffer cache of arca12 was
_still_ ;) using a fuzzy hash).
I know _exactly_ why arca12 has a worse worst case and a bad interactive
feel while writing tons of data to disk. The culprit is buffer.c, and the
reason 2.2.5 is not stalling is the set_writetime misfeature, which on
the other hand is harming performance a lot in the stock kernel.
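To show what I mean, here is roughly what that misfeature looks like (a
from-memory paraphrase of the 2.2 fs/buffer.c logic, not the exact
source): every time a buffer gets dirtied its flush deadline is pushed
into the future, so bdflush keeps deferring the write-out and the writer
never stalls, at the cost of letting huge amounts of dirty data pile up:

	/* Paraphrased from memory of the 2.2-era fs/buffer.c
	 * set_writetime() logic, not the exact source: a freshly
	 * dirtied buffer gets a flush deadline in the future, so
	 * bdflush won't write it out until it has aged. */
	static void set_writetime(struct buffer_head *buf, int flag)
	{
		int newtime;

		if (buffer_dirty(buf)) {
			/* age_super/age_buffer are bdflush tunables */
			newtime = jiffies + (flag ? bdf_prm.b_un.age_super
						  : bdf_prm.b_un.age_buffer);
			if (!buf->b_flushtime || buf->b_flushtime > newtime)
				buf->b_flushtime = newtime;
		} else {
			buf->b_flushtime = 0;
		}
	}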
My current tree should be far better than my last 2.2.5_arca12 for both
performance and interactivity. Now I am running with both the buffer
cache and the page cache in rbtrees (but now I have only one rbtree for
the page cache; this improved performance, probably because the
rb-entries are now more likely to be present in the L1 cache). My new
rbtrees are definitely faster than the hashtable (at least here with 128
Mbyte of RAM, and compared to the 2.2.6 hash function with 12 bits of
depth).
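To give an idea of the lookup, here is a minimal sketch; the node layout
and names are only my illustration, not the actual arca code. The lookup
is a plain binary-tree descent (the red-black rebalancing only matters at
insert/delete time), with the whole cache in one tree ordered by
(inode, offset):

	/* Illustrative sketch only, not the actual arca code. */
	struct page_rb {
		struct page_rb *left, *right;
		struct inode *inode;
		unsigned long offset;	/* page-aligned file offset */
		struct page *page;
	};

	static struct page *
	rb_find_page(struct page_rb *node, struct inode *inode,
		     unsigned long offset)
	{
		while (node) {
			/* Order by inode first, then by offset, so a
			 * single global tree can hold every cached page. */
			if (inode < node->inode)
				node = node->left;
			else if (inode > node->inode)
				node = node->right;
			else if (offset < node->offset)
				node = node->left;
			else if (offset > node->offset)
				node = node->right;
			else
				return node->page;	/* cache hit */
		}
		return NULL;				/* cache miss */
	}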
Now I also changed the behaviour of the syncing of dirty buffers: now
it's the task itself that syncs the buffers to disk when the limit is
reached. I am a bit worried about stack overflow, but it should be said
that where the code does a mark_buffer_dirty, the code at the same depth
is probably also doing some ll_rw_block, so it should be fine even with
the scsi-stack-eater code ;). I don't have SCSI hardware though, so I
can't try it (well, really I have the parallel Zip, but precisely to
avoid stack overflow ppa uses bh heavily, so it's probably not the best
testcase for stack overflow...).
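The idea, in a hypothetical sketch (balance_dirty(), nr_dirty_buffers and
DIRTY_LIMIT are my names for illustration, not the actual arca code;
ll_rw_block() is the real 2.2 buffer I/O entry point):

	extern int nr_dirty_buffers;	/* illustrative global */
	#define DIRTY_LIMIT	1000	/* illustrative threshold */

	static void balance_dirty(struct buffer_head *bh)
	{
		/* flag argument as in the 2.2 API */
		mark_buffer_dirty(bh, 0);

		/* Instead of waking bdflush and hoping it keeps up,
		 * the dirtying task starts the write-out itself once
		 * the dirty limit is hit.  Note this calls into the
		 * driver from the same stack depth that already did
		 * the mark_buffer_dirty, hence the stack-usage worry
		 * with deep SCSI paths. */
		if (nr_dirty_buffers > DIRTY_LIMIT)
			ll_rw_block(WRITE, 1, &bh);
	}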
>arca12 is fast in getting the data to disk, but very poor in
>interactive feel - much worse than 2.0.36. Maybe it just needs
>tuning? If so, it should probably ship with different parameters. The
This needs fixing, I know, and I am working on it in all my spare time.
I still have some obvious things to improve, and to do that I'll have to
break the inode->ops->readpage/writepage API; then I'll try your testcase
and work on it.
I think I'll release a 2.2.6_arca1.bz2 without my readpage/writepage idea
(which I have not yet implemented), so you'll be able to try it out and
see whether anything changed with my latest code.
Note also that if you only go OOM, my tree has a far better interactive
feel than the stock kernel (reported as 10 times better by some testers).
If you instead burst large writes to the fs, my tree has/had some
interactive trouble that, as just said, is hidden in the stock kernel by
the set_writetime misfeature. I hope I'll be able to fix it very soon
though, so we'll get both great performance and a great interactive feel
(all the time ;).
Andrea Arcangeli