Re: NFS/d_splice_alias breakage

From: Oleg Drokin
Date: Thu Jun 02 2016 - 23:44:17 EST



On Jun 2, 2016, at 11:37 PM, Al Viro wrote:

> On Thu, Jun 02, 2016 at 06:46:08PM -0400, Oleg Drokin wrote:
>> Hello!
>>
>> I just came across a bug (while running some Lustre test scripts against NFS, hunting for another nfsd bug)
>> that seems to have been present since at least 2014 and lets users crash the NFS client locally.
>
>>> * Cluster filesystems may call this function with a negative, hashed dentry.
>>> * In that case, we know that the inode will be a regular file, and also this
>>> * will only occur during atomic_open. So we need to check for the dentry
>>> * being already hashed only in the final case.
>
> Comment is long obsolete and should've been removed. "Cluster filesystem"
> in question was GFS2 and it had been dealt with there. Mea culpa - should've
> removed the comment as soon as that was done.

Oh, ok. I assumed it was still valid, esp. considering the issue at hand where
what it describes actually happens and NFS is also a cluster filesystem of sorts ;)

>> The problem appears to have been there since at least 3.10, when the fs/nfs/dir.c code
>> was calling d_materialise_unique(), which did require the dentry to be unhashed.
>>
>> Not sure how this was not hit earlier. The crash looks like this (I added
>> a printk to ensure this is what is going on indeed and not some other weird race):
>
>> [ 64.489326] Calling into d_splice_alias with hashed dentry, dentry->d_inode (null) inode ffff88010f500c70
>
> Which of the call sites had that been and how does one reproduce that fun?
> If you feel that posting a reproducer in the open is a bad idea, just send
> it off-list...

This is fs/nfs/dir.c::nfs_lookup() right after no_entry label.

I'll send you the scripts with instructions separately for now.
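
For readers following along, the kind of check behind the printk quoted above can be modelled roughly as follows. This is only a userspace sketch under stated assumptions: toy_dentry, toy_inode and splice_precondition_violated() are hypothetical stand-ins, not kernel structures or functions, and this is not the actual debugging patch. It only illustrates the condition being reported: a dentry that is already hashed being handed to a splice routine that expects it to be unhashed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; NOT the kernel's struct inode/dentry. */
struct toy_inode {
    int ino;
};

struct toy_dentry {
    bool hashed;                /* models "dentry is currently hashed" */
    struct toy_inode *d_inode;  /* NULL models a negative dentry */
};

/*
 * Returns true (and logs a message shaped like the one in the report)
 * when the caller passes a dentry that is already hashed, i.e. when the
 * "must be unhashed" expectation is violated. Returns false otherwise.
 */
static bool splice_precondition_violated(struct toy_dentry *dentry,
                                         struct toy_inode *inode)
{
    assert(dentry != NULL);

    if (!dentry->hashed)
        return false;

    fprintf(stderr,
            "Calling into d_splice_alias with hashed dentry, "
            "dentry->d_inode %p inode %p\n",
            (void *)dentry->d_inode, (void *)inode);
    return true;
}
```

In this model, a negative dentry that is still hashed trips the check, while an unhashed one passes silently.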