Re: [GIT PULL] ocfs2 changes for 2.6.32

From: Joel Becker
Date: Wed Sep 16 2009 - 00:42:33 EST


On Tue, Sep 15, 2009 at 09:20:47PM -0700, Linus Torvalds wrote:
> On Tue, 15 Sep 2009, Joel Becker wrote:
> >
> > Perhaps ->copyfile takes the following flags:
> >
> > #define ALLOW_COW_SHARED 0x0001
> > #define REQUIRE_COW_SHARED 0x0002
> > #define REQUIRE_BASIC_ATTRS 0x0004
> > #define REQUIRE_FULL_ATTRS 0x0008
> > #define REQUIRE_ATOMIC 0x0010
> > #define SNAPSHOT (REQUIRE_COW_SHARED | \
> > REQUIRE_BASIC_ATTRS | \
> > REQUIRE_ATOMIC)
> > #define SNAPSHOT_PRESERVE (SNAPSHOT | REQUIRE_FULL_ATTRS)
> >
> > Thus, sys_reflink/sys_snapfile(oldpath, newpath, 0) becomes:
> > ...
>
> Yes. The above all sounds sane to me.

Ok. Where do you see the exposure level? What I mean is, I
just defined a vfs op that handles these things, but accessed it via two
syscalls, sys_snapfile() and sys_copyfile(). We could also just provide
one system call and allow userspace to use these flags itself, creating
snapfile(3) and copyfile(3) in libc, hiding the details (kind of like
clone being hidden by pthreads, though ignoring that pthreads has
"issues"). Or we could explicitly make this the public API and expect
something like cp(1) to directly use the flags. Thoughts?
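
A minimal sketch of the "one system call, wrappers in libc" option, for concreteness. Everything here is an assumption for illustration: sys_copyfile_stub() stands in for a syscall that does not exist, it only records the flags it was given, and the flag sets each wrapper passes (SNAPSHOT for snapfile(), ALLOW_COW_SHARED for copyfile()) are guesses, not a settled mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values as proposed earlier in this thread. */
#define ALLOW_COW_SHARED    0x0001
#define REQUIRE_COW_SHARED  0x0002
#define REQUIRE_BASIC_ATTRS 0x0004
#define REQUIRE_FULL_ATTRS  0x0008
#define REQUIRE_ATOMIC      0x0010
#define SNAPSHOT            (REQUIRE_COW_SHARED | \
                             REQUIRE_BASIC_ATTRS | \
                             REQUIRE_ATOMIC)
#define SNAPSHOT_PRESERVE   (SNAPSHOT | REQUIRE_FULL_ATTRS)

/* Flags seen by the most recent stub call, so the sketch can be inspected. */
static int last_flags;

/*
 * Hypothetical stand-in for the single system call. It copies nothing;
 * it only remembers which flags the wrapper chose.
 */
static int sys_copyfile_stub(const char *oldpath, const char *newpath, int flags)
{
    assert(oldpath != NULL && newpath != NULL);
    last_flags = flags;
    return 0;
}

/* Hypothetical snapfile(3): callers never see the flag bits. */
int snapfile(const char *oldpath, const char *newpath)
{
    return sys_copyfile_stub(oldpath, newpath, SNAPSHOT);
}

/* Hypothetical copyfile(3): sharing is allowed but not required (assumed). */
int copyfile(const char *oldpath, const char *newpath)
{
    return sys_copyfile_stub(oldpath, newpath, ALLOW_COW_SHARED);
}
```

In the other option, cp(1) would pass the flag bits itself and these wrappers would not exist.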

> I still worry that especially the non-atomic case will want some kind of
> partial-copy updates (think graphical file managers that want to show the
> progress of the copy), and that (think EINTR and continuing) makes me
> think "that could get really complex really quickly", but that's something
> that the NFS/SMB people would have to pipe up on. I'm pretty sure the NFS
> spec has some kind of "partial completion notification" model, I dunno about
> SMB.

I'm really wary of combining a ranged interface with this one.
Not only does it make no sense for snapshots, but I think it falls down
in any "create a new inode" scheme entirely.

btrfs has an ioctl that basically says "link up range x->y of
file 1 to file 2". Chris is using the underlying machinery to implement
reflink, but I think the concept actually would work nicely as a splice
flag. If you have two existing files, not creating one, you can just
ask splice to do efficient things with a SPLICE_F_EFFICIENT_COPY for
your CIFS COPY-style thing or SPLICE_F_COW_COPY for btrfs- and
ocfs2-style data sharing.
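
A rough sketch of how the two splice flags might select a copy strategy. All of it is assumed: neither flag exists today, the values 0x10 and 0x20 are picked only to sit above the existing SPLICE_F_* bits, struct fs_caps is a made-up capability record, and falling back to a plain copy when the filesystem cannot honor a flag is one possible policy, not a decided one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values; neither flag is defined by any kernel header. */
#define SPLICE_F_EFFICIENT_COPY 0x10
#define SPLICE_F_COW_COPY       0x20

/* Strategies a filesystem could use to satisfy the request. */
enum copy_path {
    PATH_PLAIN,        /* ordinary read/write data copy */
    PATH_SERVER_COPY,  /* e.g. a CIFS COPY-style server-side copy */
    PATH_COW_SHARE     /* btrfs/ocfs2-style shared extents */
};

/* Made-up description of what the target filesystem can do. */
struct fs_caps {
    int cow;          /* nonzero if extents can be shared copy-on-write */
    int server_copy;  /* nonzero if the server can copy on our behalf */
};

/*
 * Pick a strategy from the requested flags and the filesystem's abilities.
 * Assumed policy: prefer COW sharing, then server-side copy, then fall
 * back to a plain copy rather than failing.
 */
enum copy_path pick_copy_path(unsigned int flags, const struct fs_caps *caps)
{
    assert(caps != NULL);
    if ((flags & SPLICE_F_COW_COPY) && caps->cow)
        return PATH_COW_SHARE;
    if ((flags & SPLICE_F_EFFICIENT_COPY) && caps->server_copy)
        return PATH_SERVER_COPY;
    return PATH_PLAIN;
}
```

Both files already exist in this model, so no new inode is created and a byte range could be added without the problems noted above.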

Joel

--

"Nothing is wrong with California that a rise in the ocean level
wouldn't cure."
- Ross MacDonald

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker@xxxxxxxxxx
Phone: (650) 506-8127