Re: [PATCH] NFS: only invalidate dentrys that are clearly invalid.
From: NeilBrown
Date: Sun Nov 15 2020 - 23:44:30 EST
On Mon, Nov 16 2020, Trond Myklebust wrote:
> On Mon, 2020-11-16 at 13:59 +1100, NeilBrown wrote:
>>
>> Prior to commit 5ceb9d7fdaaf ("NFS: Refactor
>> nfs_lookup_revalidate()")
>> and error from nfs_lookup_verify_inode() other than -ESTALE would
>> result
>> in nfs_lookup_revalidate() returning that error code (-ESTALE is
>> mapped
>> to zero).
>> Since that commit, all errors result in zero being returned.
>>
>> When nfs_lookup_revalidate() returns zero, the dentry is invalidated
>> and, significantly, if the dentry is a directory that is mounted on,
>> that mountpoint is lost.
>>
>> If you:
>> - mount an NFS filesystem which contains a directory
>> - mount something (e.g. tmpfs) on that directory
>> - use iptables (or scissors) to block traffic to the server
>> - ls -l the-mounted-on-directory
>> - interrupt the 'ls -l'
>> you will find that the directory has been unmounted.
>>
>> This can be fixed by returning the actual error code from
>> nfs_lookup_verify_inode() rather then zero (except for -ESTALE).
>>
>> Fixes: 5ceb9d7fdaaf ("NFS: Refactor nfs_lookup_revalidate()")
>> Signed-off-by: NeilBrown <neilb@xxxxxxx>
>> ---
>> fs/nfs/dir.c | 8 +++++---
>> 1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
>> index cb52db9a0cfb..d24acf556e9e 100644
>> --- a/fs/nfs/dir.c
>> +++ b/fs/nfs/dir.c
>> @@ -1350,7 +1350,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> struct dentry *dentry,
>> unsigned int flags)
>> {
>> struct inode *inode;
>> - int error;
>> + int error = 0;
>>
>> nfs_inc_stats(dir, NFSIOS_DENTRYREVALIDATE);
>> inode = d_inode(dentry);
>> @@ -1372,8 +1372,10 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> struct dentry *dentry,
>> nfs_check_verifier(dir, dentry, flags & LOOKUP_RCU)) {
>> error = nfs_lookup_verify_inode(inode, flags);
>> if (error) {
>> - if (error == -ESTALE)
>> + if (error == -ESTALE) {
>> nfs_zap_caches(dir);
>> + error = 0;
>> + }
>> goto out_bad;
>> }
>> nfs_advise_use_readdirplus(dir);
>> @@ -1395,7 +1397,7 @@ nfs_do_lookup_revalidate(struct inode *dir,
>> struct dentry *dentry,
>> out_bad:
>> if (flags & LOOKUP_RCU)
>> return -ECHILD;
>> - return nfs_lookup_revalidate_done(dir, dentry, inode, 0);
>> + return nfs_lookup_revalidate_done(dir, dentry, inode, error);
>
> Which errors do we actually need to return here? As far as I can tell,
> the only errors that nfs_lookup_verify_inode() is supposed to return is
> ENOMEM, ESTALE, ECHILD, and possibly EIO or ETiMEDOUT.
>
> Why would it be better to return those errors rather than just a 0 when
> we need to invalidate the inode, particularly since we already have a
> special case in nfs_lookup_revalidate_done() when the dentry is root?
ERESTARTSYS is the error that easily causes problems.
Returning 0 causes d_invalidate() to be called which is quite heavy
handed in mountpoints.
So it is only reasonable to return 0 when we have unambiguous
confirmation from the server that the object no longer exists. ESTALE
is unambiguous. EIO might be unambiguous. ERESTARTSYS, ENOMEM,
ETIMEDOUT are transient and don't justify d_invalidate() being called.
(BTW, Commit cc89684c9a26 ("NFS: only invalidate dentrys that are clearly invalid.")
fixed much the same bug 3 years ago).
Thanks,
NeilBrown
>
>> }
>>
>> static int
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@xxxxxxxxxxxxxxx
Attachment:
signature.asc
Description: PGP signature