Re: [PATCH v2 1/2] seccomp: notify user trap about unused filter
From: Christian Brauner
Date: Fri May 29 2020 - 04:50:34 EST
On Fri, May 29, 2020 at 01:32:03AM +0200, Jann Horn wrote:
> On Fri, May 29, 2020 at 1:11 AM Kees Cook <keescook@xxxxxxxxxxxx> wrote:
> > On Thu, May 28, 2020 at 05:14:11PM +0200, Christian Brauner wrote:
> > > * @usage: reference count to manage the object lifetime.
> > > * get/put helpers should be used when accessing an instance
> > > * outside of a lifetime-guarded section. In general, this
> > > * is only needed for handling filters shared across tasks.
> > > [...]
> > > + * @live: Number of tasks that use this filter directly and number
> > > + * of dependent filters that have a non-zero @live counter.
> > > + * Altered during fork(), exit(), and filter installation
> > > [...]
> > > refcount_set(&sfilter->usage, 1);
> > > + refcount_set(&sfilter->live, 1);
> [...]
> > After looking at these other lifetime management examples in the kernel,
> > I'm convinced that tracking these states separately is correct, but I
> > remain uncomfortable about task management needing to explicitly make
> > two calls to let go of the filter.
> >
> > I wonder if release_task() should also detach the filter from the task
> > and do a put_seccomp_filter() instead of waiting for task_free(). This
> > is supported by the other place where seccomp_filter_release() is
> > called:
> >
> > > @@ -396,6 +400,7 @@ static inline void seccomp_sync_threads(unsigned long flags)
> > > * allows a put before the assignment.)
> > > */
> > > put_seccomp_filter(thread);
> > > + seccomp_filter_release(thread);
> >
> > This would also remove the only put_seccomp_filter() call outside of
> > seccomp.c, since the free_task() call will be removed now in favor of
> > the release_task() call.
> >
> > So, is it safe to detach the filter in release_task()? Has dethreading
> > happened yet? i.e. can we race TSYNC? -- is there a possible
> > inc-from-zero?
>
> release_task -> __exit_signal -> __unhash_process ->
> list_del_rcu(&p->thread_node) drops us from the thread list under
> siglock, which is the same lock TSYNC uses.
We should move the seccomp_filter_release() call to after
write_unlock_irq(&tasklist_lock). At that point we're past
__exit_signal(), so the task is unhashed and can no longer be discovered
by TSYNC, and we don't need tasklist_lock to be held either:
diff --git a/kernel/exit.c b/kernel/exit.c
index b332e3635eb5..5490cc04f436 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -193,8 +193,6 @@ void release_task(struct task_struct *p)
cgroup_release(p);
write_lock_irq(&tasklist_lock);
ptrace_release_task(p);
thread_pid = get_pid(p->thread_pid);
@@ -220,6 +218,7 @@ void release_task(struct task_struct *p)
}
write_unlock_irq(&tasklist_lock);
+ seccomp_filter_release(p);
proc_flush_pid(thread_pid);
put_pid(thread_pid);
release_thread(p);
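For illustration only, here is a rough user-space sketch of the two-counter
lifetime model discussed above: @usage keeps the memory around, @live counts
tasks plus dependent filters that are still live. Every name in it
(toy_filter, toy_fork, toy_filter_release, ...) is hypothetical and uses plain
ints instead of refcount_t; it is not the kernel code, just a model of the
intended semantics under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct seccomp_filter (not the kernel type). */
struct toy_filter {
	int usage;               /* object lifetime: freed when this hits 0 */
	int live;                /* tasks + dependent filters still in use  */
	bool notified;           /* stand-in for waking the notifier        */
	struct toy_filter *prev; /* parent filter in the chain              */
};

static int toy_freed; /* number of filters freed so far */

/* Install a new filter on top of prev; the task's reference to prev is
 * handed over to the new filter, so prev's counters stay unchanged. */
static struct toy_filter *toy_filter_install(struct toy_filter *prev)
{
	struct toy_filter *f = calloc(1, sizeof(*f));

	if (!f)
		abort();
	f->usage = 1;
	f->live = 1;
	f->prev = prev;
	return f;
}

/* Extra lifetime-only reference, e.g. held by a notifier fd. */
static void toy_filter_get(struct toy_filter *f)
{
	f->usage++;
}

/* fork(): the child task uses the same filter, so both counters go up. */
static struct toy_filter *toy_fork(struct toy_filter *f)
{
	f->usage++;
	f->live++;
	return f;
}

/* Models the release_task() step: drop @live and walk up the chain for
 * as long as filters become unused, notifying each one exactly once. */
static void toy_filter_release(struct toy_filter *f)
{
	while (f && --f->live == 0) {
		f->notified = true;
		f = f->prev;
	}
}

/* Models the final put: drop @usage and free up the chain. */
static void toy_filter_put(struct toy_filter *f)
{
	while (f && --f->usage == 0) {
		struct toy_filter *prev = f->prev;

		free(f);
		toy_freed++;
		f = prev;
	}
}
```

In this model a filter can be marked unused (and its listener told about it)
while something else still holds a @usage reference, which is the reason for
tracking the two states separately.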
Christian