Re: Linux's future: //posix/ipc, //root and so on ?

From: Hans Reiser (reiser@idiom.com)
Date: Thu Mar 02 2000 - 11:45:57 EST


Andi Kleen wrote:

>
> [NFSv4 has volatile file handles that require the client to store the
> path and relookup when needed, but of course the
> end result will be even farther from POSIX semantics than current NFS
> already is]

If I am guessing correctly what a volatile file handle is, it sounds like the
right answer to me, along with changing the POSIX semantics. What exactly is a
volatile file handle? Does it let a file handle go bad and then simply redo the
lookup? Is it an attempt to cure stale file handles?
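
To make the question concrete, here is a minimal user-space sketch of the
behaviour as Andi describes it: the client keeps the path next to the handle
and repeats the lookup when the handle goes bad. Everything in it (toy_server,
vol_handle, ERR_EXPIRED, the generation counter) is a hypothetical stand-in,
not NFSv4 protocol code.

```c
#include <assert.h>
#include <string.h>

#define PATH_MAX_LEN 64
#define ERR_EXPIRED  (-1)   /* assumed stand-in for "this handle went bad" */
#define ERR_NOENT    (-2)   /* assumed stand-in for "no such name" */

/* Toy "server": one file; handles are valid only for the current generation. */
struct toy_server {
    const char *path;
    unsigned generation;    /* bumped whenever old handles are invalidated */
};

/* Client-side volatile handle: opaque cookie plus the path that produced it. */
struct vol_handle {
    unsigned cookie;
    char path[PATH_MAX_LEN];
};

int toy_lookup(const struct toy_server *srv, const char *path, unsigned *cookie)
{
    if (strcmp(path, srv->path) != 0)
        return ERR_NOENT;
    *cookie = srv->generation;
    return 0;
}

int toy_read(const struct toy_server *srv, unsigned cookie)
{
    return cookie == srv->generation ? 0 : ERR_EXPIRED;
}

/* Look the name up once and remember the path for later recovery. */
int vol_open(const struct toy_server *srv, const char *path, struct vol_handle *h)
{
    assert(strlen(path) < PATH_MAX_LEN);
    strcpy(h->path, path);
    return toy_lookup(srv, path, &h->cookie);
}

/* Use the handle; if it expired, redo the lookup by path and retry once. */
int vol_read(const struct toy_server *srv, struct vol_handle *h)
{
    int rc = toy_read(srv, h->cookie);
    if (rc == ERR_EXPIRED) {
        rc = toy_lookup(srv, h->path, &h->cookie);
        if (rc == 0)
            rc = toy_read(srv, h->cookie);
    }
    return rc;
}
```

In this sketch, if the file is renamed while the handle is bad, the relookup
fails even though the file still exists, which is one way such a scheme would
drift from POSIX semantics.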

>
> >
> > Getting the decent replacement for iget() (i.e. one that would be fs-specific
> > and would return proper dentry) is not hard. Current code tries to be too
> > generic - it wants to be implemented in terms of iget() and <barf> readdir().
> > Which is _really_ wrong way to go - it doesn't make sense for a lot of
> > filesystems _and_ it sucks in terms of locking for the rest. If we make
> > foo_lookup() aware of the situation (== aware of the fact that there may be
> > a subtree trying to join the main) we can toss most of this shit away.
>
> For traditional unix file systems like NFS it is very hard to rebuild a
> path using a key (it is always very costly and usually breaks when the
> directory tree changes in between)
>
> >
> > And forget about open_by_inode() - you will not get it. It simply doesn't make
> > sense for most of filesystems. Now, an interface that would give you fhandle
> > by name/opened file and opened file by fhandle actually makes sense. But st_ino
> > from stat() is _not_ a suitable interface for that.
>
> The inode is just a key. To make the key usable for stateless fs like nfs
> you need to make the key a cookie that is valid over reboots. Now that "key"
> is traditionally the inode number in unix file systems. In reiserfs it
> is an aggregate of object id, directory id and possibly an offset. The only
> reason to stay with a 32bit "key" for ino_t is user space compatibility,
> but in the kernel it should really be an arbitrary length byte buffer (in
> practice the key is limited by the NFS spec and it makes sense to keep
> a static limit to avoid dynamic memory management)
>
> I guess you agree with me on that ?

I do.
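
A minimal sketch of what such a key could look like: an opaque byte buffer
with a compile-time cap, so no allocation is needed. The names fs_key,
FS_KEY_MAX, fs_key_set and fs_key_equal are hypothetical, and the 64-byte cap
is only an assumption borrowed from the NFSv3 file handle size limit.

```c
#include <assert.h>
#include <string.h>

#define FS_KEY_MAX 64   /* assumed cap; NFSv3 file handles are at most 64 bytes */

/* Opaque, length-prefixed key; fixed storage avoids dynamic allocation. */
struct fs_key {
    unsigned char len;
    unsigned char data[FS_KEY_MAX];
};

/* Copy len bytes into the key; returns 0, or -1 if the key would not fit. */
int fs_key_set(struct fs_key *k, const void *src, size_t len)
{
    assert(k != NULL);
    if (len > FS_KEY_MAX)
        return -1;
    if (len > 0)
        memcpy(k->data, src, len);
    k->len = (unsigned char)len;
    return 0;
}

/* Keys are equal only if both length and bytes match. */
int fs_key_equal(const struct fs_key *a, const struct fs_key *b)
{
    return a->len == b->len && memcmp(a->data, b->data, a->len) == 0;
}
```

A 32-bit inode number then becomes just one possible 4-byte key, and a
filesystem with a larger aggregate key fills in more bytes.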
>
> If we can meet at an open_by_key() function everything would be ok.
>
> [
> In 2.1 space I had a proposal (+patches) for "virtual hardlinks" that
> generated dentries from inums by just creating new hidden names for
> them that only existed in the dcache. Unfortunately it was rejected :/
> ]
>
> For the reiserfs work I added the following extensions to VFS.
> To super_operations:
>
> struct dentry *(*build_dentry)(struct super_block *, __u8 *key, int keylen);
>
> To inode operations:
>
> int (*getkey) (struct dentry *, __u8 *key, int keylen);
>
> I guess your approach is similar ? It probably would make sense to merge
> our proposals early. If yours handles the reiserfs case cleanly (64bit
> keys) I would be happy to switch.

It would be nice if you could accommodate variable length keys, for the sake of
what we want to do in the future. Our next major version will probably drop
hashing of filenames in keys and use variable length keys instead, so that we
can do stem compression in large directories.
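
As a rough user-space illustration of how a getkey/build_dentry style pair
could cope with variable length keys, the mock-up below builds a key from a
directory id followed by the raw filename. The toy_* names and the key layout
are hypothetical, and the return conventions (key length on success, -1 when
the buffer is too small) are assumptions, since the quoted prototypes do not
specify them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy object: identified by its directory id and its name within it. */
struct toy_obj {
    uint32_t dir_id;
    const char *name;
};

/* Shaped like getkey(): write the key, return its length, or -1 if it does
 * not fit in keylen bytes. Key = 4-byte big-endian dir id + name bytes. */
int toy_getkey(const struct toy_obj *o, uint8_t *key, int keylen)
{
    size_t n = strlen(o->name);
    int need = 4 + (int)n;

    assert(key != NULL);
    if (keylen < need)
        return -1;
    key[0] = (uint8_t)(o->dir_id >> 24);
    key[1] = (uint8_t)(o->dir_id >> 16);
    key[2] = (uint8_t)(o->dir_id >> 8);
    key[3] = (uint8_t)o->dir_id;
    memcpy(key + 4, o->name, n);
    return need;
}

/* Shaped like build_dentry(): map a key back to its object, or NULL. */
const struct toy_obj *toy_build(const struct toy_obj *tab, int ntab,
                                const uint8_t *key, int keylen)
{
    uint32_t dir;
    size_t n;
    int i;

    if (keylen < 4)
        return NULL;
    dir = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
          ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    n = (size_t)(keylen - 4);
    for (i = 0; i < ntab; i++) {
        if (tab[i].dir_id == dir && strlen(tab[i].name) == n &&
            memcmp(tab[i].name, key + 4, n) == 0)
            return &tab[i];
    }
    return NULL;
}
```

Because the caller passes the buffer size and gets the real key length back,
short and long keys go through the same interface without a fixed key width.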

>
> The problem the reiserfs implementation has is that it still deals very poorly
> with file system tree changes (it inherited that from the nfsfh algorithm).
> It is basically not a really robust solution, more a very elaborate hack :/
>
> -Andi
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.rutgers.edu
> Please read the FAQ at http://www.tux.org/lkml/

-- 
Get Linux (http://www.kernel.org) plus ReiserFS
 (http://devlinux.org/namesys).  If you sell an OS or
internet appliance, buy a port of ReiserFS!  If you
need customizations and industrial grade support, we sell them.


This archive was generated by hypermail 2b29 : Tue Mar 07 2000 - 21:00:13 EST