Re: Regression in 5.1.20: Reading long directory fails

From: Jason L Tibbitts III
Date: Tue Sep 03 2019 - 21:51:03 EST


I asked the XFS folks, who said that the issues with 64-bit inodes are
old, are confined to filesystems larger than the one I'm using, are not
an issue with NFSv4, and only show up on 32-bit clients with old
userspace.

In any case, I have been experimenting a bit and somehow the issue seems
to be related to exporting with sec=krb5i:krb5p or sec=krb5i. If I
export with just sec=krb5p, things magically begin to work.

So basically:

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
7685
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

(umount, then re-export with krb5i on the server)

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
ls: reading directory '/home/tester': Input/output error
5623
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5i,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

(umount, then re-export with krb5i:krb5p on the server)

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
ls: reading directory '/home/tester': Input/output error
5623
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5i,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

(umount, switch back to plain krb5p)

[root@ld00 ~]# ls -l ~tester|wc -l; grep tester /proc/mounts
7685
nas00:/export/misc-00/tester /home/tester nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=krb5p,clientaddr=172.21.84.191,local_lock=none,addr=172.21.86.77 0 0

The number of entries it lists before failing sometimes changes (in
this case it has been as low as a few hundred), but I don't know what
causes it to change.
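
To track how that count varies between runs, the check from the
transcript can be wrapped in a small helper that prints the negotiated
sec= flavor, the number of entries read, and the exit status of ls on
one line. This is only a sketch: the names sec_flavor and check_dir are
made up for illustration, and the optional second argument exists only
so the helper can be pointed at a saved copy of /proc/mounts. It lists
with plain "ls -A", so the numbers will differ slightly from
"ls -l | wc -l" (no "total" line, dotfiles included), and names
containing newlines would be miscounted.

```shell
#!/bin/sh
# Sketch only: the helper names below are hypothetical.

# Print the sec= flavor recorded for a mount point.
# $1 = mount point, $2 = mounts table (defaults to /proc/mounts)
sec_flavor() {
    awk -v mp="$1" '$2 == mp {
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] ~ /^sec=/) {
                sub(/^sec=/, "", opts[i])
                print opts[i]
            }
    }' "${2:-/proc/mounts}"
}

# Print one summary line for a directory. ls still emits the entries it
# managed to read before an error, so a short count together with a
# non-zero status points at a failed directory read.
# $1 = directory, $2 = mounts table (optional, passed to sec_flavor)
check_dir() {
    listing=$(ls -A "$1" 2>/dev/null)
    rc=$?
    if [ -z "$listing" ]; then
        count=0
    else
        count=$(printf '%s\n' "$listing" | wc -l)
    fi
    echo "sec=$(sec_flavor "$1" "$2") entries=$((count)) ls_status=$rc"
}
```

With the functions loaded, "check_dir /home/tester" after each re-export
gives one line per run that can be appended to a log and compared later.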

Anyway, I hope this helps to pinpoint the problem. I now have a really
easy way to reproduce this without having to kick people off of the
server, and if the successes aren't just some kind of false positives
then I guess I also have a workaround. I'm still at a loss as to why a
revert of the readdir changes makes any difference at all here.

- J<