Re: ftrace startup tests crashing due to missing rcu_synchronize()
From: Sven Schnelle
Date: Wed Feb 16 2022 - 14:24:21 EST
Steven Rostedt <rostedt@xxxxxxxxxxx> writes:
> On Wed, 16 Feb 2022 19:39:03 +0100
> Sven Schnelle <svens@xxxxxxxxxxxxx> wrote:
>> [ 4.460091] Unable to handle kernel pointer dereference in virtual kernel address space
>> [ 4.460375] Failing address: 6b6b6b6b6b6b6000 TEID: 6b6b6b6b6b6b6803
>> [ 4.460458] Fault in home space mode while using kernel ASCE.
>> [ 4.460695] AS:000000008561c007 R3:0000000000000024
>> [ 4.461143] Oops: 0038 ilc:3 [#1] PREEMPT SMP
>> [ 4.461162] Modules linked in:
>> [ 4.461175] CPU: 245 PID: 0 Comm: swapper/245 Not tainted 5.17.0-rc4-00051-gc5d9ae265b10-dirty #4
>> [ 4.461183] Hardware name: IBM 8561 T01 701 (KVM/Linux)
>
> I this a 390?
Yes.
>> Looking at unregister_ftrace_function(), i noticed that
>> ftrace_shutdown() is called with 0 as command. Given that the ftrace
>> function didn't change and ftrace is still enabled, the
>> rcu_synchronize() functions in ftrace_shutdown() are silently skipped.
>> So the caller frees ops already before other CPUs have gone through
>> quiesce, and may therefore use the old (now freed) list entry.
>>
>> To fix this, i wonder whether we should change the code in
>> unregister_ftrace_function() to:
>>
>> @@ -7827,7 +7837,7 @@ int unregister_ftrace_function(struct ftrace_ops *ops)
>> int ret;
>>
>> mutex_lock(&ftrace_lock);
>> - ret = ftrace_shutdown(ops, 0);
>> + ret = ftrace_shutdown(ops, FTRACE_UPDATE_TRACE_FUNC);
>
> No, the ftrace_shutdown() will add that flag if it is needed.
>
>> mutex_unlock(&ftrace_lock);
>>
>> return ret;
>>
>> I haven't checked whether other callsites of unregister_ftrace_function()
>> also need to be adjusted. What do you think about that 'fix'?
>
> But what I'm thinking is, the function is being freed but has yet to be
> removed from the list. Or that a synchronization is missed.
>
> That is, shutdown is called, the item is removed from the list and freed,
> but something got preempted while on the ftrace trampoline, with a
> reference to the item, and then woke up and executed the item that was
> freed.
>
> I'll look into it. Thanks for the report.
With additional debugging i see:
@@ -2967,14 +2974,17 @@ int ftrace_shutdown(struct ftrace_ops *ops, int command)
}
if (!command || !ftrace_enabled) {
+ pr_err("%s: skipping rcu_synchronize(): ops=%pS command=%d ftrace_enabled=%d saved func=%pS ftrace_trace_func=%pS\n",
+ __func__, ops, command, ftrace_enabled, saved_ftrace_func, ftrace_trace_function);
/*
* If these are dynamic or per_cpu ops, they still
* need their data freed. Since, function tracing is
* not currently active, we can just free them
* without synchronizing all CPUs.
*/
[ +0.000011] unregister_ftrace_function: 0x2f6792e00
[ +0.000023] removing ops 00000002f6792e00 trace_selftest_test_dyn_func+0x0/0x18
[ +0.000661] ftrace_shutdown: skipping rcu_synchronize(): ops=0x2f6792e00 command=0 ftrace_enabled=1 saved func=arch_ftrace_ops_list_func+0x0/0x1b0 ft>
[ +0.010032] unregister_ftrace_function: test_probe1+0x0/0x1b0
[ +0.000017] removing ops 000000009d876e40 trace_selftest_test_probe1_func+0x0/0x18
[ +0.000053] ftrace_shutdown: skipping rcu_synchronize(): ops=test_probe1+0x0/0x1b0 command=0 ftrace_enabled=1 saved func=arch_ftrace_ops_list_func+0x>
[ +0.000017] unregister_ftrace_function: test_probe2+0x0/0x1b0
[ +0.000081] removing ops 000000009d876ff0 trace_selftest_test_probe2_func+0x0/0x18
[ +0.000064] ftrace_shutdown: skipping rcu_synchronize(): ops=test_probe2+0x0/0x1b0 command=0 ftrace_enabled=1 saved func=arch_ftrace_ops_list_func+0x>
[ +0.000015] unregister_ftrace_function: test_probe3+0x0/0x1b0
[ +0.000011] removing ops 000000009d8771a0 trace_selftest_test_probe3_func+0x0/0x18
[ +0.000108] unregister_ftrace_function: global_ops+0x0/0x1b0
[ +0.000010] removing ops 000000009d9325b8 trace_selftest_test_global_func+0x0/0x18
[ +0.025759] PASSED
So the rcu_synchronize is definitely skipped. Another thing i was
wondering was whether we need to reset the next pointer in
the to-be-removed entry in remove_ftrace_ops(). But i haven't
investigated that in detail yet.