[RFC PATCH 0/2] disable optimistic spinning for ftrace_lock

From: Yafang Shao

Date: Wed Mar 04 2026 - 02:48:02 EST


Background
==========

One of our latency-sensitive services reported random CPU %sys spikes.
After a thorough investigation, we identified the root cause. The key
kernel stacks are as follows:

- Task A

2026-02-14-16:53:40.938243: [CPU198] 2156302(bpftrace) cgrp:4019437 pod:4019253

find_kallsyms_symbol+142
module_address_lookup+104
kallsyms_lookup_buildid+203
kallsyms_lookup+20
print_rec+64
t_show+67
seq_read_iter+709
seq_read+165
vfs_read+165
ksys_read+103
__x64_sys_read+25
do_syscall_64+56
entry_SYSCALL_64_after_hwframe+100

This task (2156302, bpftrace) is reading the
/sys/kernel/debug/tracing/available_filter_functions file to check whether
a function is traceable:

https://github.com/bpftrace/bpftrace/blob/master/src/tracefs/tracefs.h#L21

Reading the available_filter_functions file is time-consuming, as it
contains tens of thousands of functions:

$ cat /sys/kernel/debug/tracing/available_filter_functions | wc -l
61081

$ time cat /sys/kernel/debug/tracing/available_filter_functions > /dev/null

real 0m0.452s
user 0m0.000s
sys 0m0.452s

Consequently, the ftrace_lock is held by this task for an extended period.

- Other Tasks

2026-02-14-16:53:41.437094: [CPU79] 2156308(bpftrace) cgrp:4019437 pod:4019253

mutex_spin_on_owner+108
__mutex_lock.constprop.0+1132
__mutex_lock_slowpath+19
mutex_lock+56
t_start+51
seq_read_iter+250
seq_read+165
vfs_read+165
ksys_read+103
__x64_sys_read+25
do_syscall_64+56
entry_SYSCALL_64_after_hwframe+100

Since ftrace_lock is held by Task-A, and Task-A is actively running on a
CPU, every other task waiting for the same lock optimistically spins on its
own CPU (mutex_spin_on_owner() in the stack above) instead of sleeping.
This is what drives the %sys spikes.

Reproduction
============

This issue can be reproduced simply by running
`cat available_filter_functions`.

- Single process reading available_filter_functions:

$ time cat /sys/kernel/tracing/available_filter_functions > /dev/null

real 0m0.452s
user 0m0.001s
sys 0m0.451s

- Six processes reading available_filter_functions simultaneously:

for i in `seq 0 5`; do
    time cat /sys/kernel/tracing/available_filter_functions > /dev/null &
done
wait

The results are as follows:

real 0m1.801s
user 0m0.000s
sys 0m1.779s

real 0m1.804s
user 0m0.001s
sys 0m1.791s

real 0m1.805s
user 0m0.000s
sys 0m1.792s

real 0m1.806s
user 0m0.001s
sys 0m1.796s

As more processes are added, the system time increases correspondingly.

Solution
========

One approach is to optimize the reading of available_filter_functions
itself to make it as fast as possible. However, no matter how fast the read
path becomes, the underlying risk remains: with optimistic spinning, every
waiter of a long-held mutex burns a CPU for the entire hold time.

Therefore, we need to consider an alternative solution that avoids
optimistic spinning for heavy mutexes that may be held for long durations.
Note that we do not want to disable CONFIG_MUTEX_SPIN_ON_OWNER entirely, as
that could lead to unexpected performance regressions.

In this series, a new wrapper mutex_lock_nospin() is introduced and used
for ftrace_lock to selectively disable optimistic spinning.
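To make the intended semantics concrete, the wrapper's behavior can be
sketched in pseudocode (illustrative only; helper names here are made up,
and the real implementation lives in patch 1 of this series):

```
/*
 * mutex_lock_nospin(): acquire the mutex, but never optimistically
 * spin on the owner.
 */
mutex_lock_nospin(lock):
        if (fastpath_try_acquire(lock))
                return          /* uncontended: identical to mutex_lock() */
        /*
         * Contended: skip the optimistic-spin step that mutex_lock()
         * would perform, and go straight to sleep on the wait queue.
         */
        enqueue_as_waiter(lock)
        sleep_until_lock_granted(lock)
```

Patch 2 then switches the ftrace_lock call sites from mutex_lock() to the
new wrapper, so contended readers of available_filter_functions sleep
instead of spinning.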

Yafang Shao (2):
  locking: add mutex_lock_nospin()
  ftrace: disable optimistic spinning for ftrace_lock

include/linux/mutex.h | 3 +++
kernel/locking/mutex.c | 39 +++++++++++++++++++++++++------
kernel/trace/ftrace.c | 52 +++++++++++++++++++++---------------------
3 files changed, 61 insertions(+), 33 deletions(-)

--
2.47.3