Difference in NFS locking in 2.0 or 2.2, was:Bug

Jensen Allan AJE (aje@arcodan.dk)
18 Apr 1999 17:36:00 -0000


Hi,

Just a follow-up on what I've found:

>> I've just set up a diskless Linux system, where I boot a 486 off the network
>> (bootp, tftp, nfs). It worked like a charm - everything ran smooth and fine,
>> until I wanted to set the box up as a NIS client. That's when I ran into
>> trouble.
>> I set up the NIS domain (ypdomainset), ran portmap, then ypbind, and here
>> trouble showed up.
>> Every time I started ypbind (with debug uption ) it said "ypbind already
>> running (pid xxxx) - exiting.

I've tried several things, including upgrading ypbind to the latest version,
the kernel to 2.2.6 and everything else I had in my power to do.

Now I wanted to try to look into the source to see what exactly went wrong here,
and I managed to put the lockfile handling together in this little piece of
code (randomly ripped from ypbind source):

<code snippet START>

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>

int locktest (int fd, int type, off_t offset, int whence, off_t len)
{
struct flock lock;

lock.l_type = type; /* F_RDLCK or F_WRLCK */
lock.l_start = offset; /* byte offset, relative to l_whence */
lock.l_whence = whence; /* SEEK_SET, SEEK_CUR, SEEK_END */
lock.l_len = len; /* #bytes (0 means to EOF) */

if (fcntl(fd, F_GETLK, &lock) < 0)
{
printf("fcntl error\n");
exit (0);
}

if (lock.l_type == F_UNLCK)
return(0); /* false, region is not locked by another proc */
return(lock.l_pid); /* true, return pid of lock owner */
}

void main (void)
{
int fd;
int pid;

printf ("My pid : %d\n", getpid());
fd=open ("pidfile.txt", O_CREAT | O_RDWR,
S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
if ( fd != -1)
{
pid=locktest (fd, F_WRLCK, 0, SEEK_SET, 0);
printf ("File pidfile.txt locked by pid %d\n", pid);
}
else
{
printf ("Error in open()\n");
}
}

<code snippet END>

If I run this code, I get the pid of the running program and the pid of the
program which has the file locked.
If I run this on a ext2 volume, it tells me that the file is locked by pid 0 -
meaning, it's not locked.
If I run this from a 2.2.5 diskless client (which boots off a 2.2.6 server with
knfsd) however, it tells me that the file is locked - by the very same pid that
this program is running as!
What's causing even more headache is that it works like it's supposed to do
when I go to a 2.0.35-box I've got, mount an NFS volume from a 2.2.5 box and
run the program in the newly mounted directory, just like if it was executed
from a local filesystem.

So, scenario is:

2.2.6 standalone box - file not locked by any pid
2.2.6 server with knfsd to 2.2.5 client - file locked with same pid.
2.2.5 server with userspace nfsd to 2.0.35 client - file not locked by any pid
2.0.35 server with userspace nfsd to 2.0.36 client - file not locked by any pid

So.. Is this an interesting 2.2.x 'undetected feature', or is it on purpose?
If the latter, could someone please fix ypbind so that it'll be able to have
its lockfiles onto an NFS mounted volume under kernel 2.2.x?

Best regards,
_______________________________________________________________________
Allan Jensen Scientific Atlanta Arcodan A/S Phone +45 73122150
System Administrator Augustenborg Landevej 7 Direct +45 73122154
IT-Support DK-6400 Sonderborg Fax +45 74423907
aje@sciatl.dk http://www.arcodan.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/