Re: [PATCH ALT4 V2 1/2] audit: show fstype:pathname for entries with anonymous parents

From: Richard Guy Briggs
Date: Mon Aug 14 2017 - 00:33:11 EST


On 2017-08-11 02:36, Richard Guy Briggs wrote:
> On 2017-06-28 15:03, Paul Moore wrote:
> > On Tue, Jun 27, 2017 at 5:11 PM, Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> > > On 2017-05-30 17:21, Paul Moore wrote:
> > >> On Tue, Apr 4, 2017 at 5:21 AM, Richard Guy Briggs <rgb@xxxxxxxxxx> wrote:
> >
> > ...
> >
> > >> > diff --git a/kernel/audit.c b/kernel/audit.c
> > >> > index 25dd70a..7d83c5a 100644
> > >> > --- a/kernel/audit.c
> > >> > +++ b/kernel/audit.c
> > >> > @@ -66,6 +66,7 @@
> > >> > #include <linux/freezer.h>
> > >> > #include <linux/pid_namespace.h>
> > >> > #include <net/netns/generic.h>
> > >> > +#include <linux/dcache.h>
> > >> >
> > >> > #include "audit.h"
> > >> >
> > >> > @@ -1884,6 +1885,10 @@ void audit_copy_inode(struct audit_names *name, const struct dentry *dentry,
> > >> > name->gid = inode->i_gid;
> > >> > name->rdev = inode->i_rdev;
> > >> > security_inode_getsecid(inode, &name->osid);
> > >> > + if (name->dentry) {
> > >> > + dput(name->dentry);
> > >> > + name->dentry = NULL;
> > >> > + }
> > >>
> > >> Out of curiosity, what terrible things happen if we take a reference
> > >> to a non-NULL dentry passed to audit_copy_inode() and store it in
> > >> name->dentry? Does performance tank?
> > >
> > > Interesting idea. Right now it is optimized to only take a reference to
> > > the dentry's parent dentry in the case we're handed an anonymous entry.
> > > Most of the time it will never be used even though we invest in the
> > > overhead of taking a reference count. Besides, __audit_inode_child()
> > > hands in a NULL for the dentry parameter to audit_copy_inode().
> >
> > [NOTE: audit_copy_inode() hands a NULL dentry only in the anonymous parent case]
> >
> > I believe I was just thinking of less conditional handling, especially
> > when reference counts are concerned. I'm just trying to limit future
> > headaches, but I suspect the perf cost would be problematic, and as
> > you point out, there is no *need* for the majority of cases.
> >
> > Looking at this again today, why would we want to clear name->dentry
> > in audit_copy_inode() if it is already set? Does that ever happen?
> > I'm not sure it does ...
>
> Ok, I just tried re-writing that part to dput that dentry only for the
> child of an anonymous parent rather than whenever audit_copy_inode() was
> called and I ended up with a:
>
> BUG: Dentry ffff8800338f1dc0{i=369a,n=stderr} still in use (1) [unmount of tmpfs tmpfs]
> WARNING: CPU: 0 PID: 387 at fs/dcache.c:1445 umount_check+0x99/0xc0
>
> This was after rebasing on a recent audit-next based on 4.11 rather than
> the previous based on a 4.8 audit-next. Something else changed in the
> kernel or kernel config between these two points. I was getting some
> "INFO: suspicious RCU usage." warnings in idmap for NFS on the earlier
> kernel and that is no longer happenning, and I'm no longer getting any
> tracefs audit PATH entries in the more recent kernel which suggests that
> whatever was causing the anonymous parent PATH records from tracefs has
> been changed/fixed/configured-out. This tmpfs Dentry issue happens on
> both 4.8 and 4.11 kernels.

Minor correction here: I'm still getting tracefs PATH records on 4.11,
so whatever happenned in that test appears to be a test procedure error.

> So the tmpfs is being unmounted within the time of a task that has taken
> an anonymous parent audit_name and it appears that the dput in
> audit_copy_inode() called from elsewhere had reset it to avoid
> needlessly extending the life of this dentry.
>
> I see two obvious ways to solve this:
> - Return to freeing this dentry from audit_copy_inode()
> - Add tmpfs to the list of filesystems to not create PATH records
>
> I'm really not crazy aobut this second one unless I know why tmpfs is
> generating calls with anonymous parents.
>
> For reference, the rest of the call trace is:
> dump_stack+0x85/0xc9
> __warn+0xd1/0xf0
> ? d_genocide_kill+0x40/0x40
> warn_slowpath_null+0x1d/0x20
> umount_check+0x99/0xc0
> d_walk+0x10b/0x580
> ? do_one_tree+0x26/0x40
> do_one_tree+0x26/0x40
> shrink_dcache_for_umount+0x5d/0xd0
> generic_shutdown_super+0x1f/0xf0
> kill_litter_super+0x29/0x40
> deactivate_locked_super+0x43/0x70
> deactivate_super+0x88/0xa0
> cleanup_mnt+0x8e/0xe0
> __cleanup_mnt+0x12/0x20
> task_work_run+0x83/0xc0
> do_exit+0x45c/0xfe0
> ? syscall_trace_enter+0x2e4/0x400
> do_group_exit+0x68/0xe0
> SyS_exit_group+0x14/0x20
> do_syscall_64+0x82/0x270
> entry_SYSCALL64_slow_path+0x25/0x25
>
> > > I'm assuming you are hinting at also using that dentry to compare the
> > > audit_names entry, which I think it a bad idea since there could be
> > > multiple paths to access a dentry. I did orignially have another patch
> > > that would have tried to use that as well, which didn't seem to hurt,
> > > but I didn't think was worth upstreaming.
> >
> > No, I wasn't thinking that, the dev/inode numbers should be sufficient
> > in those cases I believe; I'm not sure the dentry would help us here.
> >
> > >> Also out of curiosity, why do we want to drop a dentry reference here
> > >> if one already exists?
> > >
> > > I think we want to drop a dentry reference here because this inode child
> > > could be a subsequent access to the same dentry with a full path,
> > > removing the need to cache this dentry information in the first place.
> >
> > Related to my comment above from today ... what code path please?
>
> So it appears that there is a code path that does free this dentry in a
> timely fashion to avoid needlessly extending its life and simply leaving
> it there until the audit_context is freed is too long.
>
> > >> > @@ -1925,6 +1930,17 @@ void audit_log_name(struct audit_context *context, struct audit_names *n,
> > >> > audit_log_n_untrustedstring(ab, n->name->name,
> > >> > n->name_len);
> > >> > }
> > >> > + } else if (n->dentry) {
> > >> > + char *fullpath;
> > >> > + const char *fullpathp;
> > >> > +
> > >> > + fullpath = kmalloc(PATH_MAX, GFP_KERNEL);
> > >> > + if (!fullpath)
> > >> > + return;
> > >>
> > >> I'm wondering if there is some value in still emitting the record if
> > >> the kmalloc() fails, just with the name field set as the unset "?"
> > >> value, e.g. "name=?". Thoughts?
> > >
> > > Possibly. We've got much bigger problems if that happens, but this
> > > sounds like a good defensive coding approach. I'm even tempted to call
> > > audit_panic().
> >
> > No audit_panic(). We've still got good information that we can
> > record, e.g. dev/inode numbers; let's just print "name=?" and go on
> > our way recording the rest of the information. This is in keeping
> > with the current audit_log_name() error handling.
> >
> > At the very least you need to clean up here instead of just returning.
> > As the patch currently stands I believe this will end up leaking an
> > audit_buffer.
>
> This has been fixed along with a fullpath kmalloc leak.
>
> > paul moore
>
> - RGB
>
> --
> Richard Guy Briggs <rgb@xxxxxxxxxx>
> Sr. S/W Engineer, Kernel Security, Base Operating Systems
> Remote, Ottawa, Red Hat Canada
> IRC: rgb, SunRaycer
> Voice: +1.647.777.2635, Internal: (81) 32635
>
> --
> Linux-audit mailing list
> Linux-audit@xxxxxxxxxx
> https://www.redhat.com/mailman/listinfo/linux-audit

- RGB

--
Richard Guy Briggs <rgb@xxxxxxxxxx>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635