Re: knfsd 1.5 is released

H . J . Lu (hjl@valinux.com)
Thu, 7 Oct 1999 16:28:35 -0700


On Thu, Oct 07, 1999 at 02:40:13PM +1000, Neil Brown wrote:
> On Friday October 1, hjl@valinux.com wrote:
> >
> > It is too much change for me. I fixed it in a different way in
> > 1.5.2.
> >
> > --
> > H.J. Lu (hjl@gnu.org)
>
> "Fixed" is kind of an interesting verb....
>
> I have just been browsing through 1.5.2. Let me tell you some of what
> I found.
>
> In nfsd-2.2.12-2.mountd.patch:
>
> - There is a new export flag, NFSEXP_REJECTED.
> fh_verify will set it on any NFSEXP_NEGATIVE export entry that was
> used to reject a request (with ESTALE). Seems reasonable, but why?
> Lets see...
>
> - There is a new function, exp_get_rejected which searches through
> all exports for all clients looking for an NFSEXP_REJECTED export
> entry. If it finds one it removes it from the export table and
> returns it.
> exp_export (which implements the system call to add entries to the
> exports table) call this fairly early and if it gets back an entry
> it uses it instead of kallocing new memory for the new export
> entry.
>
> Why? well I guess this helps auto-expire negative entries, once
> they have been used at least once, but it seems a bit random. At
> the very least, some sort of LRU ordering would seem appropriate.
> I would prefer removals to be explicitly managed by some user-level
> process that has access to a last-used-time for each export entry.

I was concerned that too much memory might be consumed by the kernel.
I'd like to seee some kind of auto-expire negative entries. I took the
easiest way out. We should do better.

>
> But on to the more relevant stuff:
>
> - nfsd_netlink_notify ( which is called when an export entry cannot be
> found) constructs a message consisting of:
> . The sockaddr_in of the client that is making a request.
> . the knfs_fh file handle (the 32byte filehandle in a suitable
> structure) of the request
> . The path name of the file being accessed
> . The path name of the filesystem containing the file (these two
> are grouped together into a single char[] array.)
>
> Then in utils/mountd/mountd.c:
> nlfd_handler (which is given these messages when they arrive):
>
> - calls get_rootfh with the client address and file pathname, much
> as mountd does when responding to the MOUNTD_MNT request, which
> will try to find a relevant export entry and will feed it to the
> kernel. If this worked and the export entry has the same
> xdev/xino as the filehandle, then all is ok. This seems fine.
>
> But what happens if it doesn't find an appropriate mount entry,
> or if the the xdev/xino is wrong?
>
> First to clarify what needs to happen here:
> A NFSEXP_NEGATIVE export entry needs to be given to the kernel
> with an appropriate client, and with exactly the xdev/xino out
> of the filehandle. If not, then when this request is
> retransmitted, the same process will be repeated.
>
> So what does nlfd_handler do? First it stats the path to the
> file being accessed to see if it has the right xdev/xino. The
> chances of this are probably "poor" in general, though it might
> work in certain senarios.
> If it isn't the right xdev/xino, then it stats the filesystem
> that contained the filehandle, and assumes that this will have
> the right dev/ino. If it was the root of the filesystem that was
> exported, this will work fine (as would previous versions). If
> it wasn't then this will not have the right dev/ino.
>
> In either case, the dev and ino of the last stat is used to
> create a negative export entry which is feed to the kernel.
>
> So, what happens when we:
>
> have a filesystem /a
> export /a/b to fred
> on fred, mount /a/b as /ab
> on fred, cd /ab/c (which we assume exists)
> on server, unexport /a/b to fred
> on fred, ls -l .
>
> Fred will send some sort of request with the filehandle it got
> for /a/b/c. This will have a dev/ino referring to /a/b/c and an
> xdev/xino referring to /a/b.
>
> knfsd will pass the filehandle and path names up to mountd which
> will:
> try to see if /a/b/c (or any parent thereof) is currently
> exported to fred. It won't be.
> see if /a/b/c has the same dev/ino as the file handle. It wont.
> use the dev/ino for /a assuming it is right. It won't be.
>
> Then it gives a NFSEXP_NEGATIVE export for /a to the kernel.
>
> fred timesout and resends. knfsd will look for an export entry
> for client fred, referring to dev/ino of /a/b, and wont find one,
> so we go around the loop again.
>
>
> So maybe you can see how I wonder at your usage of the verb "fixed".

Here is a patch for the problem. It may not solve all cases. But I
think it covers most ones.

>
> Maybe we should have a bit of discussion (in nfs-devel) about
> whether this is all a good idea, what filehandles should look-like in
> order to support it best, whether the up-call should use RPC (as
> solaris does) and related issues before we flail around with too much
> alpha-quality code. There are other bits of knfsd that need some
> work first, such as the filehandle->dentry mapping and the issue of
> whether filehandles really need to contain the inode number of the
> parent.
>

I see 1.5.x as alpha. Nothing is final yet. The reason I like it is it
gets rid of /var/lib/nfs/rmtab. However, that means we have to find a
way for kernel to communicate to mountd. Maybe RPC is a better choice.
Anyone wants to implement it?

Thanks.

H.J.

---
Index: utils/mountd/mountd.c
===================================================================
RCS file: /home/cvsroot/knfsd/utils/mountd/mountd.c,v
retrieving revision 1.21
diff -u -p -r1.21 mountd.c
--- mountd.c	1999/10/01 20:37:05	1.21
+++ mountd.c	1999/10/07 22:43:45
@@ -51,6 +51,7 @@ void nlfd_handler(int fd)
 	struct stat stb;
 	int err;
 	char *p = buf.path;
+	char epath[MAXPATHLEN+1];
 
 	xlog(L_NOTICE, "NFS request from unknown host\n");
 
@@ -69,17 +70,41 @@ void nlfd_handler(int fd)
 		     fh->fh_xino);
 	}
 
-	/* Put in a negative export for the right directory. */
-	if (stat(p, &stb) < 0) {
-		xlog(L_ERROR, "Cannot stat %s for making negative exportent.",
-		     p);
-		return;
+	/* Put in a negative export for the right directory. We should
+	   find a match for dev/ino. Try the longest matching pathname.
+	 */
+	strncpy(epath, p, sizeof (epath) - 1);
+	epath[sizeof (epath) - 1] = '\0';
+	for (p = NULL;;) {
+		if (stat(epath, &stb) < 0) {
+			xlog(L_ERROR, "Cannot stat %s for making negative exportent.", p);
+			return;
+		}
+		if (stb.st_dev == buf.fh.fh_xdev
+		    && stb.st_ino == buf.fh.fh_xino) {
+			p = epath;
+			break;
+		}
+		else {
+			/* We have to treat the root, "/", specially. */
+			if (p == &epath[1]) break;
+			p = strrchr(epath, '/');
+			if (p == epath) p++;
+			*p = '\0';
+		}
 	}
+
 	if (stb.st_dev != buf.fh.fh_xdev
 	    || stb.st_ino != buf.fh.fh_xino) {
 		p = &buf.path[buf.devlen + 1];
 		if (stat(p, &stb) < 0) {
 			xlog(L_ERROR, "Cannot stat %s for making negative exportent.", p);
+			return;
+		}
+		if (stb.st_dev != buf.fh.fh_xdev
+		    || stb.st_ino != buf.fh.fh_xino) {
+			xlog(L_NOTICE, "No matching exported directory from nfsd netlink: %d/%d",
+			buf.fh.fh_xdev, buf.fh.fh_xino);
 			return;
 		}
 	}

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/