Re: [00/17] Large Blocksize Support V3
From: David Chinner
Date: Fri Apr 27 2007 - 02:09:58 EST
On Thu, Apr 26, 2007 at 10:15:28PM -0700, Andrew Morton wrote:
> On Fri, 27 Apr 2007 14:20:46 +1000 David Chinner <dgc@xxxxxxx> wrote:
>
> > > blocksizes via this scheme - instantiate and lock four pages and go for
> > > it.
> >
> > So now how do you get block aligned writeback?
>
> in writeback and pageout:
>
> if (page->index & mapping->block_size_mask)
> continue;
So we might do writeback on one page in N - how do we
make sure none of the other pages are reclaimed while we are doing
writeback on this block?
IOWs, we have to lock every page in the block, mark them all as
writeback, etc. Instead of doing something once, we have
to repeat it for every page in the block. This is better than a compound
page, how?
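
To make the cost concrete, here is a minimal userspace sketch of the index arithmetic behind the quoted check. It is only a model under stated assumptions: struct block_geom and the helper names are hypothetical, the number of pages per block is assumed to be a power of two, and nothing here is real page cache code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a block made of several page-sized pieces. */
struct block_geom {
    unsigned long pages_per_block;  /* assumed to be a power of two */
    unsigned long block_size_mask;  /* pages_per_block - 1 */
};

static inline struct block_geom block_geom_init(unsigned long pages_per_block)
{
    /* The mask trick below only works for power-of-two block sizes. */
    assert(pages_per_block != 0 &&
           (pages_per_block & (pages_per_block - 1)) == 0);
    struct block_geom g = { pages_per_block, pages_per_block - 1 };
    return g;
}

/* Mirrors the quoted test: only the first page of a block passes. */
static inline bool is_block_head(const struct block_geom *g,
                                 unsigned long index)
{
    return (index & g->block_size_mask) == 0;
}

/* Index of the first page in the block that contains 'index'. */
static inline unsigned long block_first_index(const struct block_geom *g,
                                              unsigned long index)
{
    return index & ~g->block_size_mask;
}

/*
 * Number of pages that each per-block operation (lock, mark as
 * writeback, ...) has to visit in this model. With a single compound
 * page the same operation would be done once.
 */
static inline unsigned long pages_to_touch(const struct block_geom *g)
{
    return g->pages_per_block;
}
```

With four pages per block, writeback starts only at indexes 0, 4, 8 and so on, but every one of those starts still has to visit four pages.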
> > Or make sure that truncate
> > doesn't race on a partial *block* truncate?
>
> lock four pages
And the locking order? How do you enforce *kernel wide* the
same locking order for all pages in the same block so that we
don't get ABBA deadlocks on page locks within a block?
i.e:
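
A minimal sketch of the kind of rule every caller would have to follow, assuming the convention chosen is "lock in ascending index order from the start of the block". The function name, the MAX_PAGES_PER_BLOCK limit and the ordering convention itself are hypothetical; this only models the order and takes no real locks.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PAGES_PER_BLOCK 16  /* arbitrary limit for this sketch */

/*
 * Fill 'order' with the page indexes to lock, in ascending order,
 * for the block containing 'index'. The result depends only on the
 * block, never on which page the caller happened to start from, so
 * two paths entering the same block at different pages take the
 * locks in the same sequence.
 *
 * Returns the number of entries written to 'order'.
 */
static inline size_t block_lock_order(unsigned long index,
                                      unsigned long pages_per_block,
                                      unsigned long *order)
{
    assert(pages_per_block != 0 && pages_per_block <= MAX_PAGES_PER_BLOCK);
    /* Power-of-two block sizes only, so the mask below is valid. */
    assert((pages_per_block & (pages_per_block - 1)) == 0);

    unsigned long first = index & ~(pages_per_block - 1);

    for (size_t i = 0; i < pages_per_block; i++)
        order[i] = first + i;

    return pages_per_block;
}
```

A path that starts at page 7 and one that starts at page 4 of the same four-page block both get the order 4, 5, 6, 7. The sketch does nothing for a path that already holds page 7 and then needs page 4; that path would have to drop its lock and start again, and that is the case that has to be found and fixed everywhere in the kernel.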
> > You basically have to
> > jump through nasty, nasty hoops, to handle corner cases that are introduced
> > because the generic code can no longer reliably lock out access to a
> > filesystem block.
This way lies insanity.
> > way to serialise access to these aggregated structures. This is
> > the way XFS used to work in it's data path, and we all know how long
> > and loud people complained about that.....
> >
> > A filesystem specific aggregation mechanism is not a palatable solution
> > here because it drives filesystems away from being able to use generic
> > code.
>
> I would expect we could (should) implement this in generic code by
> modifying the existing stuff.
So you're suggesting that we reintroduce a buffer-oriented filesystem
interface to support large block sizes?
> I'm not saying it's especially simple, nor fast. But it has the advantage
> that we're not forced to use larger pages with _it's_ attendant performance
> problems.
So you'll take slow, inefficient and complex rather than use an
a non-intrusive and /optional/ interface to large pages?
Words fail me......
Cheers,
Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group