Re: [RFC] fsblock

From: Nick Piggin
Date: Mon Jul 09 2007 - 21:08:22 EST


On Mon, Jul 09, 2007 at 05:59:47PM -0700, Christoph Lameter wrote:
> On Tue, 10 Jul 2007, Nick Piggin wrote:
>
> > > Hmmm.... I did not notice that yet but then I have not done much work
> > > there.
> >
> > Notice what?
>
> The bad code for the buffer heads.

Oh. Well my first mail in this thrad listed some of the problems
with them.


> > > > - A real "nobh" mode. nobh was created I think mainly to avoid problems
> > > > with buffer_head memory consumption, especially on lowmem machines. It
> > > > is basically a hack (sorry), which requires special code in filesystems,
> > > > and duplication of quite a bit of tricky buffer layer code (and bugs).
> > > > It also doesn't work so well for buffers with non-trivial private data
> > > > (like most journalling ones). fsblock implements this with basically a
> > > > few lines of code, and it shold work in situations like ext3.
> > >
> > > Hmmm.... That means simply page struct are not working...
> >
> > I don't understand you. jbd needs to attach private data to each bh, and
> > that can stay around for longer than the life of the page in the pagecache.
>
> Right. So just using page struct alone wont work for the filesystems.
>
> > There are no changes to the filesystem API for large pages (although I
> > am adding a couple of helpers to do page based bitmap ops). And I don't
> > want to rely on contiguous memory. Why do you think handling of large
> > pages (presumably you mean larger than page sized blocks) is strange?
>
> We already have a way to handle large pages: Compound pages.

Yes but I don't want to use large pages and I am not going to use
them (at least, they won't be mandatory).


> > Conglomerating the constituent pages via the pagecache radix-tree seems
> > logical to me.
>
> Meaning overhead to handle each page still exists? This scheme cannot
> handle large contiguous blocks as a single entity?

Of course some things have to be done per-page if the pages are not
contiguous. I actually haven't seen that to be a problem or have much
reason to think it will suddenly become a problem (although I do like
Andrea's config page sizes approach for really big systems that cannot
change their HW page size).
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/