Re: [PATCH v2] fs: try an opportunistic lookup for O_CREAT opens too
From: Jan Kara
Date: Thu Aug 08 2024 - 13:11:46 EST
On Thu 08-08-24 12:36:07, Christian Brauner wrote:
> On Wed, Aug 07, 2024 at 10:36:58AM GMT, Jeff Layton wrote:
> > On Wed, 2024-08-07 at 16:26 +0200, Christian Brauner wrote:
> > > > +static struct dentry *lookup_fast_for_open(struct nameidata *nd, int open_flag)
> > > > +{
> > > > + struct dentry *dentry;
> > > > +
> > > > + if (open_flag & O_CREAT) {
> > > > + /* Don't bother on an O_EXCL create */
> > > > + if (open_flag & O_EXCL)
> > > > + return NULL;
> > > > +
> > > > + /*
> > > > + * FIXME: If auditing is enabled, then we'll have to unlazy to
> > > > + * use the dentry. For now, don't do this, since it shifts
> > > > + * contention from parent's i_rwsem to its d_lockref spinlock.
> > > > + * Reconsider this once dentry refcounting handles heavy
> > > > + * contention better.
> > > > + */
> > > > + if ((nd->flags & LOOKUP_RCU) && !audit_dummy_context())
> > > > + return NULL;
> > >
> > > Hm, the audit_inode() on the parent is done independent of whether the
> > > file was actually created or not. But the audit_inode() on the file
> > > itself is only done when it was actually created. Imho, there's no need
> > > to do audit_inode() on the parent when we immediately find that file
> > > already existed. If we accept that then this makes the change a lot
> > > simpler.
> > >
> > > The inconsistency would partially remain though. When the file doesn't
> > > exist audit_inode() on the parent is called but by the time we've
> > > grabbed the inode lock someone else might already have created the file
> > > and then again we wouldn't audit_inode() on the file but we would have
> > > on the parent.
> > >
> > > I think that's fine. But if that's bothersome the more aggressive thing
> > > to do would be to pull that audit_inode() on the parent further down
> > > after we created the file. Imho, that should be fine?...
> > >
> > > See https://gitlab.com/brauner/linux/-/commits/vfs.misc.jeff/?ref_type=heads
> > > for a completely untested draft of what I mean.
> >
> > Yeah, that's a lot simpler. That said, my experience when I've worked
> > with audit in the past is that people who are using it are _very_
> > sensitive to changes of when records get emitted or not. I don't like
> > this, because I think the rules here are ad-hoc and somewhat arbitrary,
> > but keeping everything working exactly the same has been my MO whenever
> > I have to work in there.
> >
> > If a certain access pattern suddenly generates a different set of
> > records (or some are missing, as would be in this case), we might get
> > bug reports about this. I'm ok with simplifying this code in the way
> > you suggest, but we may want to do it in a patch on top of mine, to
> > make it simple to revert later if that becomes necessary.
>
> Fwiw, even with the rearranged checks in v3 of the patch audit records
> will be dropped because we may find a positive dentry but the path may
> have trailing slashes. At that point we just return without audit
> whereas before we always would've done that audit.
>
> Honestly, we should move that audit event as right now it's just really
> weird and see if that works. Otherwise the change is somewhat horrible
> complicating the already convoluted logic even more.
>
> So I'm appending the patches that I have on top of your patch in
> vfs.misc. Can you (other as well ofc) take a look and tell me whether
> that's not breaking anything completely other than later audit events?
The changes look good as far as I'm concerned but let me CC audit guys if
they have some thoughts regarding the change in generating audit event for
the parent. Paul, does it matter if open(O_CREAT) doesn't generate audit
event for the parent when we are failing open due to trailing slashes in
the pathname? Essentially we are speaking about moving:
audit_inode(nd->name, dir, AUDIT_INODE_PARENT);
from open_last_lookups() into lookup_open().
Honza
--
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR