Re: [PATCH] proc: revalidate dentry returned by proc_pid_follow_link

From: Eric W. Biederman
Date: Sun Nov 08 2009 - 05:16:27 EST


Jeff Layton <jlayton@xxxxxxxxxx> writes:

> On Fri, 6 Nov 2009 20:36:01 +0000
> Jamie Lokier <jamie@xxxxxxxxxxxxx> wrote:
>
>> Jeff Layton wrote:
>> > The problem here is that this makes that code shortcut any lookup or
>> > revalidation of the dentry. In general, this isn't a problem -- in most
>> > cases the dentry is known to be good. It is a problem however for NFSv4.
>> > If this symlink is followed on an open operation no actual open call
>> > occurs and the open state isn't properly established. This causes
>> > problems when we later try to use this file descriptor for actual
>> > operations.
>>
>> As NFS uses open() as a kind of fcntl-lock barrier, I can see it's
>> important to do _something_ on new opens, rather than just cloning
>> most of the file descriptor.
>>
>
> I guess you mean the close-to-open cache consistency? If so, this
> problem doesn't actually break that. The actual nfs_file_open call does
> occur even when you're opening by following one of these symlinks. I
> believe the cache consistency code occurs there.
>
> The problem here is really nfsv4 specific. There the on-the-wire open
> call and initialization of state actually happens during d_lookup and
> d_revalidate. Neither of these happens with these LAST_BIND symlinks so
> we end up with a filp that has no NFSv4 state attached.
>
>> > This patch takes a minimalist approach to fixing this by making the
>> > /proc/pid follow_link routine revalidate the dentry before returning it.
>>
>> What happens if the file descriptor you are re-opening is for a file
>> which has been deleted. Does it still have a revalidatable dentry?
>>
>
> Well, these LAST_BIND symlinks return a real dget'ed dentry today. If
> we assume that it always returns a valid dentry (which seems to be the
> case), then I suppose it's OK to do a d_revalidate against it.
>
> It's possible though that that revalidate will either fail though or
> return that it's no good. In that case, this code just returns ESTALE
> which should make the path walking code revalidate all the way up the
> chain. That should (hopefully) make whatever syscall we're servicing
> return an error.

Hmm. Looking at the code I get the impression that a file bind mount
will have exactly the same problem.

Can you confirm.

If file bind mounts also have this problem a bugfix to to just
proc seems questionable.

Eric
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/