Re: [PATCH v3] tracing: Guard __DECLARE_TRACE() use of __DO_TRACE_CALL() with SRCU-fast

From: Joel Fernandes

Date: Thu Dec 11 2025 - 19:12:19 EST


On 12/11/2025 3:23 PM, Paul E. McKenney wrote:
> On Thu, Dec 11, 2025 at 08:02:15PM +0000, Joel Fernandes wrote:
>>
>>
>>> On Dec 8, 2025, at 1:20 PM, Paul E. McKenney <paulmck@xxxxxxxxxx> wrote:
>>>
>>> The current use of guard(preempt_notrace)() within __DECLARE_TRACE()
>>> to protect invocation of __DO_TRACE_CALL() means that BPF programs
>>> attached to tracepoints are non-preemptible. This is unhelpful in
>>> real-time systems, whose users apparently wish to use BPF while also
>>> achieving low latencies. (Who knew?)
>>>
>>> One option would be to use preemptible RCU, but this introduces
>>> many opportunities for infinite recursion, which many consider to
>>> be counterproductive, especially given the relatively small stacks
>>> provided by the Linux kernel. These opportunities could be shut down
>>> by sufficiently energetic duplication of code, but this sort of thing
>>> is considered impolite in some circles.
>>>
>>> Therefore, use the shiny new SRCU-fast API, which provides somewhat faster
>>> readers than those of preemptible RCU, at least on Paul E. McKenney's
>>> laptop, where task_struct access is more expensive than access to per-CPU
>>> variables. And SRCU-fast provides way faster readers than does SRCU,
>>> courtesy of being able to avoid the read-side use of smp_mb(). Also,
>>> it is quite straightforward to create srcu_read_{,un}lock_fast_notrace()
>>> functions.
>>>
>>> While in the area, SRCU now supports early boot call_srcu(). Therefore,
>>> remove the checks that used to avoid such use from rcu_free_old_probes()
>>> before this commit was applied:
>>>
>>> e53244e2c893 ("tracepoint: Remove SRCU protection")
>>>
>>> The current commit can be thought of as an approximate revert of that
>>> commit, with some compensating additions of preemption disabling.
>>> This preemption disabling uses guard(preempt_notrace)().
>>>
>>> However, Yonghong Song points out that BPF assumes that non-sleepable
>>> BPF programs will remain on the same CPU, which means that migration
>>> must be disabled whenever preemption remains enabled. In addition,
>>> non-RT kernels have performance expectations that would be violated by
>>> allowing the BPF programs to be preempted.
>>>
>>> Therefore, continue to disable preemption in non-RT kernels, and protect
>>> the BPF program with both SRCU and migration disabling for RT kernels,
>>> and even then only if preemption is not already disabled.
>>
>> Hi Paul,
>>
>> Is there a reason to not make non-RT also benefit from SRCU fast and trace points for BPF? Can be a follow up patch though if needed.
>
> Because in some cases the non-RT benefit is suspected to be negative
> due to increasing the probability of preemption in awkward places.
>

Since you said "suspected", I am guessing that no concrete data has been
collected to substantiate this specifically for BPF programs, but correct me if
I missed something. Assuming you are referring to latency tradeoffs caused by
preemption: Android is not PREEMPT_RT, but it is still expected to be
low-latency in general. So is this decision the right one for Android as well,
considering that (I heard) it uses BPF? Just an open-ended question.

There is also the issue of having two different paths, one for PREEMPT_RT and
one for everything else, which complicates the tracing side, so there had
better be a good reason for that, I guess.
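
To make the two-path concern concrete, here is a rough userspace model of the
decision described in the changelog quoted above. It is only a sketch and not
the patch's code: the enum, pick_trace_guard() and guard_name() are
hypothetical names made up for illustration, and treating the "preemption
already disabled" case on RT as needing no extra protection is my assumption
from the changelog wording.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the protection taken around the probe call. */
enum trace_guard {
	GUARD_PREEMPT_OFF,        /* non-RT: keep disabling preemption */
	GUARD_ALREADY_NONPREEMPT, /* RT, preemption already off (assumed: nothing extra) */
	GUARD_SRCU_FAST_MIGRATE,  /* RT, preemptible: SRCU-fast reader + migration disabled */
};

/*
 * Model of the changelog's rule. Both parameters are stand-ins:
 * preempt_rt for "is this a PREEMPT_RT kernel", preemptible for
 * "is preemption currently enabled at the tracepoint".
 */
static enum trace_guard pick_trace_guard(bool preempt_rt, bool preemptible)
{
	if (!preempt_rt)
		return GUARD_PREEMPT_OFF;
	if (!preemptible)
		return GUARD_ALREADY_NONPREEMPT;
	return GUARD_SRCU_FAST_MIGRATE;
}

/* Human-readable name for each path, handy when printing the choice. */
static const char *guard_name(enum trace_guard g)
{
	switch (g) {
	case GUARD_PREEMPT_OFF:
		return "preempt-off";
	case GUARD_ALREADY_NONPREEMPT:
		return "already-nonpreemptible";
	case GUARD_SRCU_FAST_MIGRATE:
		return "srcu-fast+migrate-disable";
	default:
		assert(0 && "unknown guard");
		return "unknown";
	}
}
```

If non-RT kernels took the SRCU-fast path as well, the first branch in
pick_trace_guard() would go away and the model would have one fewer case.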

thanks,

- Joel