Re: [PATCH][RFC] fast file mapping for loop

From: Jens Axboe
Date: Tue Jan 15 2008 - 05:08:05 EST


On Tue, Jan 15 2008, Jens Axboe wrote:
> On Tue, Jan 15 2008, Jens Axboe wrote:
> > On Mon, Jan 14 2008, Jens Axboe wrote:
> > > On Mon, Jan 14 2008, Chris Mason wrote:
> > > > Hello everyone,
> > > >
> > > > Here is a modified version of Jens' patch. The basic idea is to push
> > > > the mapping maintenance out of loop and down into the filesystem (ext2
> > > > in this case).
> > > >
> > > > Two new address_space operations are added, one to map
> > > > extents and the other to provide call backs into the FS as io is
> > > > completed.
> > > >
> > > > Still TODO for this patch:
> > > >
> > > > * Add exclusion between filling holes and readers. This is partly
> > > > implemented, when a hole is filled by the FS, the extent is flagged as
> > > > having a hole. The idea is to check this flag before trying to read
> > > > the blocks and just send back all zeros.
> > > >
> > > > The flag would be cleared when the blocks filling the hole have been
> > > > written.
> > > >
> > > > * Exclude page cache readers and writers
> > > >
> > > > * Add a way for the FS to request a commit before telling the higher
> > > > layers the IO is complete. This way we can make sure metadata related
> > > > to holes is on disk before claiming the IO is really done. COW based
> > > > filesystems will also needed it.
> > > >
> > > > * Change loop to use fast mapping only when the new address_space op is
> > > > provided (whoops, forgot about this until just now)
> > > >
> > > > * A few other bits for COW, not really relevant until there
> > > > is...something COW using it.
> > >
> > > Looks pretty good. Essentially the loop side is 100% the same, it just
> > > offloads the extent ownership to the fs (where it belongs). So I like
> > > it. Attaching a small cleanup/fixup patch for loop, don't think it needs
> > > further explanations.
> > >
> > > One suggestion - free_extent_map(), I would call that put_extent_map()
> > > instead.
> >
> > I split and merged the patch into five bits (added ext3 support), so
> > perhaps that would be easier for people to read/review. Attached and
> > also exist in the loop-extent_map branch here:
> >
> > http://git.kernel.dk/?p=linux-2.6-block.git;a=shortlog;h=loop-extent_map
>
> Seems my ext3 version doesn't work, it craps out in
> ext3_get_blocks_handle() triggering this bug:
>
> J_ASSERT(handle != NULL || create == 0);
>
> I'll see if I can fix that, being fairly fs ignorant...

This works, but probably pretty suboptimal (should end the new journal
in map_io_complete()?). And yes I know the >> 9 isn't correct, since the
fs block size is larger. Just making sure that we always have enough
blocks.

Punting to Chris!

diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 55e677d..e97181a 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -1002,11 +1002,25 @@ static struct extent_map *ext3_map_extent(struct address_space *mapping,
gfp_t gfp_mask)
{
struct extent_map_tree *tree = &EXT3_I(mapping->host)->extent_tree;
+ handle_t *handle = NULL;
+ struct extent_map *ret;

- return map_extent_get_block(tree, mapping, start, len, create, gfp_mask,
+ if (create) {
+ handle = ext3_journal_start(mapping->host, len >> 9);
+ if (IS_ERR(handle))
+ return (struct extent_map *) handle;
+ }
+
+ ret = map_extent_get_block(tree, mapping, start, len, create, gfp_mask,
ext3_get_block);
+ if (handle)
+ ext3_journal_stop(handle);
+
+ return ret;
}

+
+
/*
* `handle' can be NULL if create is zero
*/

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/