[PATCH] seccomp: avoid the lock trip in seccomp_filter_release in common case

From: Mateusz Guzik
Date: Mon Feb 10 2025 - 16:05:58 EST


The vast majority of threads don't have any seccomp filters, while the
lock taken here is shared between all threads of a given process and is
frequently taken.

Signed-off-by: Mateusz Guzik <mjguzik@xxxxxxxxx>
---

Here is a splat from parallel thread creation/destruction within one
process:

bpftrace -e 'kprobe:__pv_queued_spin_lock_slowpath { @[kstack()] = count(); }'

[snip]
@[
__pv_queued_spin_lock_slowpath+5
_raw_spin_lock_irq+42
seccomp_filter_release+32
do_exit+286
__x64_sys_exit+27
x64_sys_call+4703
do_syscall_64+82
entry_SYSCALL_64_after_hwframe+118
]: 475601
@[
__pv_queued_spin_lock_slowpath+5
_raw_spin_lock_irq+42
acct_collect+77
do_exit+1380
__x64_sys_exit+27
x64_sys_call+4703
do_syscall_64+82
entry_SYSCALL_64_after_hwframe+118
]: 478335
@[
__pv_queued_spin_lock_slowpath+5
_raw_spin_lock_irq+42
sigprocmask+106
__x64_sys_rt_sigprocmask+121
do_syscall_64+82
entry_SYSCALL_64_after_hwframe+118
]: 1825572

There are more spots which take the same lock; seccomp is in the top 3.

I did not benchmark before/after, but I can do that if you insist. The
splat above already shows this codepath is a factor.

This is a minor patch; I'm not going to insist on it.

To my reading, the seccomp filter only ever gets populated for current,
so it should be perfectly safe to test it on exit without any
synchronisation.

This may need a data_race annotation if some tooling decides to protest.
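
For illustration only, an annotated variant of the new check could look
roughly like this (a sketch, not part of the patch; READ_ONCE() keeps
the compiler from tearing or refetching the load, or data_race() could
be used if merely documenting the racy access for KCSAN is the goal):

	/*
	 * Lockless peek at the filter pointer. Per the reasoning above,
	 * filters are only ever attached to current, so an exiting task
	 * that has none cannot grow one concurrently.
	 */
	if (READ_ONCE(tsk->seccomp.filter) == NULL)
		return;

Either way the semantics are unchanged; the annotation only marks the
racy read as intentional.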

kernel/seccomp.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 7bbb408431eb..c839674966e2 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -576,6 +576,9 @@ void seccomp_filter_release(struct task_struct *tsk)
 	if (WARN_ON((tsk->flags & PF_EXITING) == 0))
 		return;
 
+	if (tsk->seccomp.filter == NULL)
+		return;
+
 	spin_lock_irq(&tsk->sighand->siglock);
 	orig = tsk->seccomp.filter;
 	/* Detach task from its filter tree. */
--
2.43.0