Re: More parallel atomic_open/d_splice_alias fun with NFS and possibly more FSes.

From: Al Viro
Date: Sun Jul 03 2016 - 23:08:20 EST


On Sun, Jul 03, 2016 at 08:37:22PM -0400, Oleg Drokin wrote:

> Hm... This dates to sometime in 2006 and my memory is a bit hazy here.
>
> I think when we called into the open, it went into fifo open and stuck there
> waiting for the other opener. Something like that. And we cannot really be stuck here
> because we are holding some locks that need to be released in predictable time.
>
> This code is actually unreachable now because the server never returns an openhandle
> for special device nodes anymore (there's a comment about it in current staging tree,
> but I guess you are looking at some prior version).
>
> I imagine device nodes might have represented a similar risk too, but it did not
> occur to me to test it separately and the testsuite does not do it either.
>
> Directories do not get stuck when you open them so they are ok and we can
> atomically open them too, I guess.
> Symlinks are handled specially on the server and the open never returns
> the actual open handle for those, so this path is also unreachable with those.

Hmm... How much does the safety of the client depend upon the correctness of
the server?

BTW, there's a fun issue in ll_revalidate_dentry(): there's nothing to
promise stability of ->d_parent in there, so uses of d_inode(dentry->d_parent)
are not safe. That's independent from parallel lookups, and it's hard
to hit, but AFAICS it's not impossible to oops there.
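
To illustrate the ->d_parent point: one common way to avoid that kind of race is to pin the parent before looking at it (in the kernel, dget_parent() returns a referenced parent). What follows is only a userspace toy model of that idea, not Lustre or VFS code; every toy_* name is hypothetical, and a mutex plus an atomic counter stand in for the real locking and refcounting.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace toy model; none of these names are kernel API. */
struct toy_dentry {
	pthread_mutex_t lock;		/* guards ->parent */
	atomic_int refcount;		/* stands in for the dentry refcount */
	struct toy_dentry *parent;	/* may be retargeted by a rename */
	int inode_no;			/* stands in for d_inode() */
};

static void toy_init(struct toy_dentry *d, struct toy_dentry *parent, int ino)
{
	pthread_mutex_init(&d->lock, NULL);
	atomic_init(&d->refcount, 1);
	d->parent = parent;
	d->inode_no = ino;
}

/*
 * Read ->parent and take a reference on it while holding the child's
 * lock, so the returned parent stays valid for the caller even if a
 * rename changes ->parent right after we drop the lock.
 */
static struct toy_dentry *toy_get_parent(struct toy_dentry *d)
{
	struct toy_dentry *p;

	pthread_mutex_lock(&d->lock);
	p = d->parent;
	atomic_fetch_add(&p->refcount, 1);
	pthread_mutex_unlock(&d->lock);
	return p;
}

/* Drop a reference taken by toy_get_parent(). */
static void toy_put(struct toy_dentry *d)
{
	int old = atomic_fetch_sub(&d->refcount, 1);

	assert(old > 0);
	(void)old;
}

/* A rename in this model: retarget ->parent under the child's lock. */
static void toy_rename(struct toy_dentry *d, struct toy_dentry *new_parent)
{
	pthread_mutex_lock(&d->lock);
	d->parent = new_parent;
	pthread_mutex_unlock(&d->lock);
}
```

The caller works with the pinned pointer it got back, never with a second read of ->parent, and drops the reference when done.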

Anyway, for Lustre the analogue of that NFS problem is here:
	} else if (!it_disposition(it, DISP_LOOKUP_NEG) &&
		   !it_disposition(it, DISP_OPEN_CREATE)) {
		/* With DISP_OPEN_CREATE dentry will be
		 * instantiated in ll_create_it.
		 */
		LASSERT(!d_inode(*de));
		d_instantiate(*de, inode);
	}
AFAICS, this (on top of mainline) ought to work:

diff --git a/drivers/staging/lustre/lustre/llite/namei.c b/drivers/staging/lustre/lustre/llite/namei.c
index 5eba0eb..b8da5b4 100644
--- a/drivers/staging/lustre/lustre/llite/namei.c
+++ b/drivers/staging/lustre/lustre/llite/namei.c
@@ -581,9 +581,11 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry,
 			  struct file *file, unsigned open_flags,
 			  umode_t mode, int *opened)
 {
+	DECLARE_WAIT_QUEUE_HEAD_ONSTACK(wq);
 	struct lookup_intent *it;
 	struct dentry *de;
 	long long lookup_flags = LOOKUP_OPEN;
+	bool switched = false;
 	int rc = 0;

 	CDEBUG(D_VFSTRACE, "VFS Op:name=%pd, dir="DFID"(%p),file %p,open_flags %x,mode %x opened %d\n",
@@ -603,11 +605,28 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry,
 	it->it_flags = (open_flags & ~O_ACCMODE) | OPEN_FMODE(open_flags);

 	/* Dentry added to dcache tree in ll_lookup_it */
+	if (!(open_flags & O_CREAT) && !d_unhashed(dentry)) {
+		d_drop(dentry);
+		switched = true;
+		dentry = d_alloc_parallel(dentry->d_parent,
+					  &dentry->d_name, &wq);
+		if (IS_ERR(dentry)) {
+			rc = PTR_ERR(dentry);
+			goto out_release;
+		}
+		if (unlikely(!d_in_lookup(dentry))) {
+			rc = finish_no_open(file, dentry);
+			goto out_release;
+		}
+	}
+
 	de = ll_lookup_it(dir, dentry, it, lookup_flags);
 	if (IS_ERR(de))
 		rc = PTR_ERR(de);
 	else if (de)
 		dentry = de;
+	else if (switched)
+		de = dget(dentry);

 	if (!rc) {
 		if (it_disposition(it, DISP_OPEN_CREATE)) {
@@ -648,6 +667,10 @@ static int ll_atomic_open(struct inode *dir, struct dentry *dentry,
 	}

 out_release:
+	if (unlikely(switched)) {
+		d_lookup_done(dentry);
+		dput(dentry);
+	}
 	ll_intent_release(it);
 	kfree(it);