Re: [RFC] symlink caching and network filesystems

Alexander Viro (viro@math.psu.edu)
Mon, 7 Jun 1999 12:29:27 -0400 (EDT)


On Mon, 7 Jun 1999, David S. Miller wrote:

> Date: Mon, 7 Jun 1999 02:33:35 -0400 (EDT)
> From: Alexander Viro <viro@math.psu.edu>
>
> On Mon, 7 Jun 1999, Erez Zadok wrote:
>
> > I'm not sure this is relevant, but please ensure that there's a
> > way for mount() to turn off such symlink caching. It's useful
> > for my user-level hlfsd to turn off symlink caching, esp. over
> > nfs. (Hlfsd creates a symlink that points to different places
> > depending on the euid accessing the symlink; it's used to
> > redirect /var/mail to users' home dirs.)
>
> Wait a minute. Are you talking about the userland server + normal
> NFS client combination? <scratching head> Hrrrmm... But READLINK in
> NFSv2 (and IIRC in v3 too) doesn't pass credentials, right? Could
> you elaborate on that?
>
> This is a non-issue. Alexander, if I understand your proposal
> correctly, the hlfsd issue is solved by using an attribute timeout of
> zero. NFS must revalidate the inode on every access, so the symlink
> cache validity is a by-product of inode revalidation. See?
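
(The crux, as I read it: with a zero attribute timeout the cached
attributes never count as fresh, so every access goes back to the server
and a cached link body is revalidated along with the rest of the inode.
A userland-style sketch of the freshness check, with illustrative names -
the real client code differs:)

#include <time.h>

/* zero timeout means "never trust the cache" */
static int attrs_fresh(time_t cached_at, time_t timeout)
{
	return timeout != 0 && time(NULL) < cached_at + timeout;
}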

Yes, that will work, but I'm still curious what Erez did ;-) Anyway, I
think that I did the Right Thing(tm) with name handling and it solved the
problem in a different way. Basically it's a new structure holding a
pointer to the name + a destructor + data for the destructor +
housekeeping information. ->k_readlink() fills such a beast. After
lookup_dentry() (in the do_follow_link case), memcpy() (nfsd_readlink) or
copy_to_user() (sys_readlink) we either release the structure or leave it
around - that depends on policy. My primary goal here was to kill the
recursion in lookup_dentry(), and that construction does the trick quite
nicely. I suspect that getname() & co. could also be done via the same
mechanism, but I haven't checked that yet. Anyway, the patch simplifies
things in a big way - a typical symlink.c is about 800 bytes now and the
code duplication is gone. All symlink caching is done in the VFS.
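
Roughly, the thing looks like this (a minimal sketch with made-up field
and function names, not the actual patch):

#include <string.h>

struct link_name {
	const char *name;			/* the symlink body */
	void (*release)(struct link_name *);	/* destructor */
	void *data;				/* destructor's private data */
	int flags;				/* housekeeping: keep it cached? */
};

/* roughly how a consumer along the lines of sys_readlink() uses it */
static int consume_link(struct link_name *ln, char *buf, int bufsiz)
{
	int len = strlen(ln->name);

	if (len > bufsiz)
		len = bufsiz;
	memcpy(buf, ln->name, len);	/* copy_to_user() in the real thing */
	if (ln->release)
		ln->release(ln);	/* or leave it around - policy */
	return len;
}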

> I'm doing some performance work on client side NFS right now, things
> look extremely promising so far, and I'll post my results (in the form
> of a patch against 2.3.x) when I have my first pass working.

[snip]

> 2) NFS dirent information lives in the page cache, no funny fixed size
> cache with funny invalidation semantics, truncate_inode_pages does
> it all, and thus NFS inode revalidation hands the grot of keeping
> dirent information from going stale as it does for all other
> attributes already.
>
> As a nice bonus this means the direct single copy overhead obtained
> in #1 happens for dirent information too.
>
> What you see on the wire for #2 is just an inode getattr for a dirent,
> in the worst case (the attribute timeout was hit). Right now if you
> obtain a working set of more than 64 directories, the network gets
> flooded with RPC requests because of our fixed NFS dirent cache size
> (actually it's less than 64 because those cache entries can each hold
> less than the full contents of a directory if the dirent information
> exceeds rsize).

Very nice. I'd really like to see it ;-)

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/