Re: [PATCH v2] fs: try an opportunistic lookup for O_CREAT opens too
From: Jeff Layton
Date: Thu Aug 08 2024 - 06:54:15 EST
On Thu, 2024-08-08 at 12:36 +0200, Christian Brauner wrote:
> On Wed, Aug 07, 2024 at 10:36:58AM GMT, Jeff Layton wrote:
> > On Wed, 2024-08-07 at 16:26 +0200, Christian Brauner wrote:
> > > > +static struct dentry *lookup_fast_for_open(struct nameidata *nd, int open_flag)
> > > > +{
> > > > + struct dentry *dentry;
> > > > +
> > > > + if (open_flag & O_CREAT) {
> > > > + /* Don't bother on an O_EXCL create */
> > > > + if (open_flag & O_EXCL)
> > > > + return NULL;
> > > > +
> > > > + /*
> > > > + * FIXME: If auditing is enabled, then we'll have to unlazy to
> > > > + * use the dentry. For now, don't do this, since it shifts
> > > > + * contention from parent's i_rwsem to its d_lockref spinlock.
> > > > + * Reconsider this once dentry refcounting handles heavy
> > > > + * contention better.
> > > > + */
> > > > + if ((nd->flags & LOOKUP_RCU) && !audit_dummy_context())
> > > > + return NULL;
> > >
> > > Hm, the audit_inode() on the parent is done independent of whether the
> > > file was actually created or not. But the audit_inode() on the file
> > > itself is only done when it was actually created. Imho, there's no need
> > > to do audit_inode() on the parent when we immediately find that file
> > > already existed. If we accept that then this makes the change a lot
> > > simpler.
> > >
> > > The inconsistency would partially remain though. When the file doesn't
> > > exist audit_inode() on the parent is called but by the time we've
> > > grabbed the inode lock someone else might already have created the file
> > > and then again we wouldn't audit_inode() on the file but we would have
> > > on the parent.
> > >
> > > I think that's fine. But if that's bothersome the more aggressive thing
> > > to do would be to pull that audit_inode() on the parent further down
> > > after we created the file. Imho, that should be fine?...
> > >
> > > See https://gitlab.com/brauner/linux/-/commits/vfs.misc.jeff/?ref_type=heads
> > > for a completely untested draft of what I mean.
> >
> > Yeah, that's a lot simpler. That said, my experience when I've worked
> > with audit in the past is that people who are using it are _very_
> > sensitive to changes of when records get emitted or not. I don't like
> > this, because I think the rules here are ad-hoc and somewhat arbitrary,
> > but keeping everything working exactly the same has been my MO whenever
> > I have to work in there.
> >
> > If a certain access pattern suddenly generates a different set of
> > records (or some are missing, as would be in this case), we might get
> > bug reports about this. I'm ok with simplifying this code in the way
> > you suggest, but we may want to do it in a patch on top of mine, to
> > make it simple to revert later if that becomes necessary.
>
> Fwiw, even with the rearranged checks in v3 of the patch audit records
> will be dropped because we may find a positive dentry but the path may
> have trailing slashes. At that point we just return without audit
> whereas before we always would've done that audit.
>
I don't think so. v3 has it drop out of rcuwalk and do the audit_inode
call in the case of trailing slashes. I took great pains here to ensure
that if we emitted a record before that we still do it after. Do let me
know if I missed a case though.
> Honestly, we should move that audit event as right now it's just really
> weird and see if that works. Otherwise the change is somewhat horrible
> complicating the already convoluted logic even more.
>
> So I'm appending the patches that I have on top of your patch in
> vfs.misc. Can you (other as well ofc) take a look and tell me whether
> that's not breaking anything completely other than later audit events?
That all looks fine to me. I'm not a heavy user of audit, but the
change you've made seems sane to me. Hopefully no one will notice.
I'll plan to do some testing with it later today.
Thanks!
--
Jeff Layton <jlayton@xxxxxxxxxx>