[BUG] knfsd causes file system corruption when files are locked.

From: Ivan Kanis (ivank@wrq.com)
Date: Wed Nov 15 2000 - 18:24:39 EST


[1.] knfsd causes file system corruption when files are locked.

[2.] Lock down a file using the NLM_SHARE sharing mechanism. Remove
the file. Unlock the file using NLM_UNSHARE. The filesystem does not
recover the file space. I am running this on ext2fs. Fsck-ing the
filesystem does not help. The only way to recover the space is to
reformat the partition.

[3.] knfsd, lock, NLM_SHARE, NLM_UNSHARE

[4.] Linux version 2.2.16 (root@jedi) (gcc version 2.7.2.3)

[5.] N/A

[6.] This test.c program will reproduce the problem. You need to compile it
on a Solaris machine because Linux fcntl does not support NLM_SHARE.

-----start here
#include <fcntl.h>
#include <errno.h>
#include <stdio.h>
  
int main (int argc, char *argv[])
{
  struct fshare lck;
  int fd, ret;
  if (argc != 2) {
    printf ("Usage: %s file to lock\n", argv[0]);
    return 1;
  }
  fd = open (argv[1], O_WRONLY);
  memset (&lck, 0, sizeof (struct flock));
  lck.f_access = F_WRACC;
  lck.f_deny = F_NODNY;
  ret = fcntl (fd, F_SHARE, &lck);
  unlink (argv[1]);
  ret = fcntl (fd, F_UNSHARE, &lck);
 
  return 0;
}
-----end here

Step to reproduce the problem

- Compile the program:
gcc test.c -o test

- Mount a Linux nfs partition on Solaris: (Remember the partition will
get corrupted, use a partition that you don't care about.)
mount -o vers=2 jedi:/sandbox /mnt

- Create a chunk of data on /mnt
dd if=/dev/zero of=/mnt/chunk count=10000

- Do a df before running the program

- Run the test program
./test /mnt/chunk

- Run df again. The free space reamains the same. The space is gone
till you reformat the partition.

[7.] This bug was seen on a Debian 2.2 machine. We have seen the same
thing happens on systems running Red Hat 6.2 and TurboLinux 6.0 distributions.

[7.1] Environment:
Kernel modules found
Gnu C 2.95.2
Binutils 2.9.5.0.37
Linux C Library ..
Dynamic Linker (ld.so) 1.9.11
ls: /usr/lib/libg++.so: No such file or directory
Procps 2.0.6
Mount 2.10f
Net-tools (1999-04-20)
Kbd 0.99
Sh-utils 2.0
Sh-utils Parker.
Sh-utils
Sh-utils Inc.
Sh-utils NO
Sh-utils PURPOSE.

[7.2] Processor information

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 5
model name : Pentium II (Deschutes)
stepping : 2
cpu MHz : 447.700
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
sep_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr
bogomips : 891.29

[7.3] Module information

aic7xxx 124328 1
nfsd 140436 8 (autoclean)
snd-pcm-oss 16840 1 (autoclean)
snd-pcm-plugin 13000 0 (autoclean) [snd-pcm-oss]
snd-mixer-oss 4308 1 (autoclean) [snd-pcm-oss]
snd-card-cs4236 5224 2
snd-mpu401-uart 2356 0 [snd-card-cs4236]
snd-rawmidi 9752 0 [snd-mpu401-uart]
snd-seq-device 3476 0 [snd-rawmidi]
isapnp 27572 0 [snd-card-cs4236]
snd-cs4236 20580 0 [snd-card-cs4236]
snd-cs4231 19008 0 [snd-card-cs4236 snd-cs4236]
snd-mixer 23536 0 [snd-mixer-oss snd-cs4236 snd-cs4231]
snd-pcm 29784 0 [snd-pcm-oss snd-pcm-plugin snd-cs4231]
snd-opl3 4328 0 [snd-card-cs4236]
snd-timer 8224 0 [snd-cs4231 snd-pcm snd-opl3]
snd-hwdep 3052 0 [snd-opl3]
snd 36300 1 [snd-pcm-oss snd-pcm-plugin snd-mixer-oss snd-card-cs4236 snd-mpu401-uart snd-rawmidi snd-seq-device snd-cs4236 snd-cs4231 snd-mixer snd-pcm snd-opl3 snd-timer snd-hwdep]
soundcore 2448 3 [snd]
3c59x 18212 1

[7.4] SCSI Information

Attached devices:
Host: scsi0 Channel: 00 Id: 05 Lun: 00
  Vendor: NEC Model: CD-ROM DRIVE:465 Rev: 1.03
  Type: CD-ROM ANSI SCSI revision: 02

[7.5] N/A

[8.] Here is a trace from the Solaris snoop program while the test
program mentioned above is running:

         sun -> jedi.wrq.com NFS C LOOKUP2 FH=2344 chunk
jedi.wrq.com -> sun NFS R LOOKUP2 OK FH=D308
         sun -> jedi.wrq.com NLM C SHARE3 OH=0009 FH=D308 Mode=0 Access=2
jedi.wrq.com -> sun NLM R SHARE3 OH=0009 granted 0
         sun -> jedi.wrq.com NLM C UNSHARE3 OH=000A FH=D308 Mode=0 Access=2
jedi.wrq.com -> sun NLM R UNSHARE3 OH=000A granted 0
         sun -> jedi.wrq.com NFS C GETATTR2 FH=2344
jedi.wrq.com -> sun NFS R GETATTR2 OK
         sun -> jedi.wrq.com NFS C LOOKUP2 FH=2344 chunk
jedi.wrq.com -> sun NFS R LOOKUP2 OK FH=D308
         sun -> jedi.wrq.com NFS C LOOKUP2 FH=2344 .nfs5FC7
jedi.wrq.com -> sun NFS R LOOKUP2 No such file or directory
         sun -> jedi.wrq.com NFS C RENAME2 FH=2344 chunk to FH=2344 .nfs5FC7
jedi.wrq.com -> sun NFS R RENAME2 OK
         sun -> jedi.wrq.com NLM C UNSHARE3 OH=000B FH=D308 Mode=0 Access=1
jedi.wrq.com -> sun NLM R UNSHARE3 OH=000B granted 0
         sun -> jedi.wrq.com NFS C REMOVE2 FH=2344 .nfs5FC7
jedi.wrq.com -> sun NFS R REMOVE2 OK

/
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Wed Nov 15 2000 - 21:00:29 EST