Re: [00/17] Large Blocksize Support V3

From: Andrew Morton
Date: Sat Apr 28 2007 - 04:24:00 EST

On Sat, 28 Apr 2007 10:04:08 +0200 Peter Zijlstra <a.p.zijlstra@xxxxxxxxx> wrote:

> >
> > The other thing is that we can batch up pagecache page insertions for bulk
> > writes as well (that is, write(2) with buffer size > page size). I should
> > have a patch somewhere for that as well if anyone is interested.
> Together with the optimistic locking from my concurrent pagecache that
> should bring most of the gains:
> sequential insert of 8388608 items:
> [ffff81007d7f60c0] insert 0 done in 15286 ms
> [ffff81006b36e040] insert 0 done in 3443 ms
> only 4.4 times faster, and more scalable, since we don't bounce the
> upper level locks around.

I'm not sure what we're looking at here. Radix-tree changes? Locking
changes? Both?

If we have a whole pile of pages to insert then there are obvious gains
from not taking the lock once per page (gang insert). But I expect there
will also be gains from not walking down the radix tree once per page too:
walk all the way down and populate all the way to the end of the node.
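
A rough userspace sketch of the two costs above, purely illustrative and not
taken from any patch in this thread. Everything in it is an assumption made
for the example: the toy `struct tree`, the fixed `HEIGHT`, 64-slot nodes,
and the `walks` / `lock_acquisitions` counters that stand in for real
descents and for taking the tree lock. Nodes are never freed.

```c
#include <stddef.h>
#include <stdlib.h>

#define MAP_SHIFT 6                     /* assumed: 64 slots per node */
#define MAP_SIZE  (1UL << MAP_SHIFT)
#define MAP_MASK  (MAP_SIZE - 1)
#define HEIGHT    4                     /* fixed height: covers 2^24 indices */

struct node {
    void *slots[MAP_SIZE];              /* child nodes, or items in a leaf */
};

struct tree {
    struct node *root;
    unsigned long walks;                /* root-to-leaf descents performed */
    unsigned long lock_acquisitions;    /* stand-in for taking the tree lock */
};

/* Walk from the root to the leaf covering 'index', allocating as needed. */
static struct node *walk_to_leaf(struct tree *t, unsigned long index)
{
    struct node *n;
    int level;

    t->walks++;
    if (!t->root && !(t->root = calloc(1, sizeof(*t->root))))
        return NULL;
    n = t->root;
    for (level = HEIGHT - 1; level > 0; level--) {
        unsigned long i = (index >> (level * MAP_SHIFT)) & MAP_MASK;

        if (!n->slots[i] && !(n->slots[i] = calloc(1, sizeof(*n))))
            return NULL;
        n = n->slots[i];
    }
    return n;
}

/* Per-page insert: one lock round trip and one full descent per page. */
static int insert_one(struct tree *t, unsigned long index, void *item)
{
    struct node *leaf;

    t->lock_acquisitions++;
    leaf = walk_to_leaf(t, index);
    if (!leaf)
        return -1;
    leaf->slots[index & MAP_MASK] = item;
    return 0;
}

/* Gang insert: one lock round trip, one descent per leaf node touched. */
static int gang_insert(struct tree *t, unsigned long start,
                       void **items, unsigned long nr)
{
    unsigned long done = 0;

    t->lock_acquisitions++;
    while (done < nr) {
        unsigned long index = start + done;
        struct node *leaf = walk_to_leaf(t, index);
        unsigned long off;

        if (!leaf)
            return -1;
        /* Populate from this slot to the end of the leaf node. */
        for (off = index & MAP_MASK; off < MAP_SIZE && done < nr; off++)
            leaf->slots[off] = items[done++];
    }
    return 0;
}
```

With 256 consecutive items starting at index 0, `gang_insert()` in this toy
does 4 descents under a single lock acquisition, where 256 calls to
`insert_one()` do 256 of each.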

The implementation could get a bit tricky, handling pages which a racer
instantiated when we dropped the lock, and suitably adjusting ->index. Not
rocket science though.
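
One way to read the racer handling, as a minimal sketch under stated
assumptions: a flat array stands in for the pagecache, `struct page` is a
one-field toy, and `fill_range()` is a hypothetical helper, not an existing
kernel function. Slots which a racer filled are left alone, and the next
preallocated page is re-targeted at the next empty slot by rewriting its
->index.

```c
#include <stddef.h>

struct page {
    unsigned long index;    /* offset of this page within the file */
};

/*
 * Fill cache[start .. start+nr) from a pool of preallocated pages.
 * Returns the number of pool pages actually consumed; any pages left
 * over would have to be freed or reused by the caller.
 */
static size_t fill_range(struct page **cache, unsigned long start,
                         unsigned long nr, struct page *pool, size_t pool_len)
{
    size_t used = 0;
    unsigned long index;

    for (index = start; index < start + nr && used < pool_len; index++) {
        if (cache[index])
            continue;               /* racer got here first: keep its page */
        pool[used].index = index;   /* adjust ->index to the slot we fill */
        cache[index] = &pool[used++];
    }
    return used;
}
```

The real thing would do this under the tree lock and on radix-tree slots
rather than an array, but the bookkeeping is the same shape.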

The depth of the radix tree matters (i.e., the file size). 'twould be useful
to always describe the tree's size when publishing microbenchmark results
like this.
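
For a feel of the numbers, a small sketch which assumes 64-slot nodes (a
MAP_SHIFT of 6) and 4KiB pages; both are assumptions here, and
`tree_levels()` is a hypothetical helper. It counts how many node levels are
needed to reach a given maximum page index.

```c
#define MAP_SHIFT 6     /* assumed: 64 slots per node */

/* Number of node levels needed to address indices 0..max_index. */
static unsigned int tree_levels(unsigned long max_index)
{
    unsigned int levels = 0;

    do {
        levels++;
        max_index >>= MAP_SHIFT;
    } while (max_index);
    return levels;
}
```

Under those assumptions the 8388608-item run quoted above (a 32GiB file)
needs 4 levels, while a 1MiB file (256 pages) needs only 2, so the per-page
walk being saved is roughly twice as long in the former case.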
