2.1.50 cannot get CWD

Keith Owens (kaos@ocs.com.au)
Sat, 16 Aug 1997 23:37:41 +1000


With 2.1.49 and 2.1.50 (and probably earlier), if you mount an NFS drive
from another machine and that machine dies, then lstat against the mount
point will fail intermittently. sys_newlstat calls do_revalidate, which
eventually calls an NFS function that sometimes decides to talk to the
other machine.

The problem is that the getcwd algorithm uses lstat to walk up the
directory tree. If a dead NFS mount point sits in one of the directories
above the CWD and its entry appears physically before the CWD or its
parents in that directory, then getcwd can fail with an I/O error.
Scripts and programs break intermittently, depending on which NFS mount
point is dead, where the CWD is anchored and the time since the last
validation.

pre-2.0.31-6 does not have this problem. Kill an NFS mount point and
getcwd works fine. You only get an I/O error if you try to traverse the
NFS tree proper; lstat on the mount point itself is fine.
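
To compare the two kernels, something as small as the following would do.
It is only a sketch: probe_lstat is a made-up helper that runs lstat on one
path, prints the result and returns 0 or the errno value.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper: lstat one path, report the outcome, and return
 * 0 on success or the errno value on failure. */
int probe_lstat(const char *path)
{
    struct stat st;
    int err;

    if (lstat(path, &st) == 0) {
        printf("%s: ok\n", path);
        return 0;
    }
    err = errno;                  /* save before printf can disturb it */
    printf("%s: %s\n", path, strerror(err));
    return err;
}
```

Run it once against the dead mount point and once against a path below
it (for example /mnt/nfs and /mnt/nfs/somefile, both placeholder names).
Going by the behaviour above, pre-2.0.31-6 should report ok for the mount
point and an I/O error only for the path below it, while 2.1.49 and 2.1.50
can intermittently report the I/O error for the mount point as well.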