Re: (reiserfs) reiserfs and knfsd and NFSv4 and volatile file handles

From: Alexander Viro (viro@math.psu.edu)
Date: Thu Mar 16 2000 - 03:10:01 EST


On Wed, 15 Mar 2000, Hans Reiser wrote:

> Not sure I understand your question, but I'll take a stab at it. Currently the
> first directory a file is created in stays its packing locality parent for the
> life of the file. The packing locality parent is the first key component. So,
> currently, if you know the parent directory id the file was created in you know
> the first component of the key, and the last component of the key is the offset
> which you can also know. This leaves the middle of the key unknown.

Ouch. What happens if you
        mkdir foo
        create foo/bar
        rename foo/bar baz/bar
        rmdir foo
and ditto for
        mkdir foo
        create foo/bar
        open foo/bar
        unlink foo/bar
        rmdir foo
- where does your packing locality <whatever> go?

Another question: how does your write_inode() interact with readdir() and
friends on parent? Do they need any sort of exclusion? What kind of
metadata is stored in the directory and is it affected by notify_change()?

One more: suppose you've got
        create foo/bar
        link foo/bar baz/bar
        unlink foo/bar
        create foo/quux unlink baz/bar
Do the last two of them need any sort of exclusion? VFS _does_ allow the
last two happen at the same time. You have big lock, but if one of them
blocks (and you are very unlikely to avoid that) - that's it.

The reason why I'm asking is that I've done a filesystem with metadata in
directory and I know what kind of PITA it can become. _And_ I've dealt
with a monster that keeps metadata in directory trying to support links.
In case if you wonder - it's AFFS. It's racey as hell and basically I'm
trying to figure out how did you solve the same set of problems.

> The immutability of keys is true merely because we don't yet have a repacker to
> do re-optimize once a day, and it will change, but for now it is true. The
> first version of our repacker probably won't change keys, but clearly it should
> eventually have that functionality.

And how do you expect NFS to handle that?

> The notion that keys/objectids should be immutable over the life of the object
> is a notion I once believed in but have since conceded was wrong of me.

        Unfortunately, NFS can't survive without that. POSIX also requires
constant inumbers, BTW.

> So, if you create a file in one directory and then access from another
> directory, we are as inefficient as using conventional inodes. At least until
> the repacker that has not yet been authored gets written by us....
>
> >
> > If you mean that we need ID of parent for knfsd - see our discussion on
> > that stuff. VFS doesn't need to know parents of non-directories.
> >
> > Let me ask again:
> > a) is your "32 bit object ID" sufficient to
> > read/write/chmod/truncate/chown the file? Forget VFS, does the thing
> > contain enough information to do it in principle (aside of scanning the
> > whole tree, indeed)?
> No, we need the directory id plus the objectid (plus the offset) to know what
> the key is.

What's the size of key?

> > b) if you have a directory, is its "32 bit object ID" sufficient to
> > find ID of its parent?
>
> directories are their own parents as far as constructing keys is concerned in
> plan A which is our next version, key construction for the current version puts
> directories in the parent of the directory which is an ~10% performance losing
> notion of a former project member, sigh. If 2.4 had been another month or two
> out we would have fixed it, such is life.... it will wait for 2.6.

Wait a minute. Assume that somebody gave you the key of directory. Can you
find the key of its current parent? Notice that it could very well be
renamed since the moment of its cretion.

> > I assume that you _do_ support link(2) - if you don't I would really like
> > to know what reiserfs_link() does. As in: why it is there and what will
> > actually happen if you'll start calling it.
>
> link works.:-) I think I explained the difficult part above, if you need more
> detail ask.

Erm... I need details on the locking you use within your tree. This scheme
has a huge potential for hidden (and hard to hit unless you know what you
are hunting for) races, unless you are very accurate with that stuff. One
more thing that began to scare me - readdir() and offsets in directory.
How do you do them?
                                                                Al

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Mar 23 2000 - 21:00:18 EST