Re: [PATCH v2 1/2] seccomp: notify user trap about unused filter
From: Christian Brauner
Date: Fri May 29 2020 - 03:51:43 EST
On Fri, May 29, 2020 at 01:32:03AM +0200, Jann Horn wrote:
> On Fri, May 29, 2020 at 1:11 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > On Thu, May 28, 2020 at 05:14:11PM +0200, Christian Brauner wrote:
> > > * @usage: reference count to manage the object lifetime.
> > > * get/put helpers should be used when accessing an instance
> > > * outside of a lifetime-guarded section. In general, this
> > > * is only needed for handling filters shared across tasks.
> > > [...]
> > > + * @live: Number of tasks that use this filter directly and number
> > > + * of dependent filters that have a non-zero @live counter.
> > > + * Altered during fork(), exit(), and filter installation
> > > [...]
> > > refcount_set(&sfilter->usage, 1);
> > > + refcount_set(&sfilter->live, 1);
> [...]
> > After looking at these other lifetime management examples in the kernel,
> > I'm convinced that tracking these states separately is correct, but I
> > remain uncomfortable about task management needing to explicitly make
> > two calls to let go of the filter.
> >
> > I wonder if release_task() should also detach the filter from the task
> > and do a put_seccomp_filter() instead of waiting for task_free(). This
> > is supported by the other place where seccomp_filter_release() is
> > called:
> >
> > > @@ -396,6 +400,7 @@ static inline void seccomp_sync_threads(unsigned long flags)
> > > * allows a put before the assignment.)
> > > */
> > > put_seccomp_filter(thread);
> > > + seccomp_filter_release(thread);
> >
> > This would also remove the only put_seccomp_filter() call outside of
> > seccomp.c, since the free_task() call will be removed now in favor of
> > the task_release() call.
> >
> > So, is it safe to detach the filter in release_task()? Has dethreading
> > happened yet? i.e. can we race TSYNC? -- is there a possible
> > inc-from-zero?
>
> release_task -> __exit_signal -> __unhash_process ->
> list_del_rcu(&p->thread_node) drops us from the thread list under
> siglock, which is the same lock TSYNC uses.
>
> One other interesting thing that can look at seccomp state is
> task_seccomp() in procfs - that can still happen at this point. At the
> moment, procfs only lets you see the numeric filter state, not the
> actual filter contents, so that's not a problem; but if we ever add a
> procfs interface for dumping seccomp filters (in addition to the
> ptrace interface that already exists), that's something to keep in
> mind.
Aside from this being not an issue now, can we please not dump seccomp
filter contents in proc. That sounds terrible and what's the rationale,
libseccomp already let's you dump filter contents while loading and you
could ptrace it. But maybe I'm missing a giant need for this...
Christian