Re: [PATCH v2] ceph: fix posix ACL hooks

From: Steven Whitehouse
Date: Tue Feb 04 2014 - 06:35:04 EST


On Mon, 2014-02-03 at 22:40 +0000, Al Viro wrote:
> > >> +static int gfs2_vfs_permission(struct dentry *dentry, struct inode *inode, int mask)
> > >> +{
> > >> + return gfs2_permission(inode, mask);
> > >> +}
> > >
> > > Er... You do realize that callers of gfs2_permission() tend to have
> > > the dentry in question, either directly or as ->d_parent of something
> > > they have?
> >
> > Not true. Look closer.
> >
> > Look at gfs2_lookupi() in particular, and check how it is called.
>
> Yeowch... gfs2_ok_to_move() is particulary nasty... WTF do we need
> it for and why is it not racy as hell?

Well, the original intent was to prevent us from moving a directory into
one of its subdirectories, and thus creating a loop. It is only called
when the rename is moving a directory, and when the parent directories
(source and destination) are different.

I know there is a problem there, recently reported by Bruce Fields which
he came across when looking at the d_splice_alias issue. The bug that
Bruce found only shows up in the clustered case, and not in the single
node case though.

Which particular race are you worried about? This check is covered by
the rename glock, which is basically a cluster wide version of the vfs
level ->s_vfs_rename_mutex.

To diverge from that topic for a moment, this thread has also brought
together some discussion on another issue which I've been pondering
recently.... that of whether the inode operations for get/set_xattr
should take a dentry or not. I had thought that we'd come to the
conclusion that 9p made it impossible to swap the current dentry
argument for an inode, and I was about to send a patch for selinux
support on clustered fs on that basis. However the discussion in this
thread has made me wonder whether that really is the case or not.... Al,
can you confirm whether your xattr-experimental patches are still under
active consideration?

The other question that I have relating to that side of things, is why
security_inode_permission() is called from __inode_permission() rather
than from generic_permission() ? Maybe there is a good reason, but I
can't immediately see what it is at the moment.

In response to the question elsewhere about GFS2 calling
gfs2_permission() after the vfs has already done its checks, that is
indeed down to needing to ensure that we have the cluster locks when
this check is called. More importantly to know that things haven't
changed since the VFS called the same function in case we've raced with
another node changing the permissions, for example. There are a number
of cases where we redo vfs level checks for this reason,

Steve.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/