Re: [RFC PATCH 6/6] tracing: use sched-RCU instead of SRCU for rcuidle tracepoints
From: Mathieu Desnoyers
Date: Tue Oct 27 2020 - 09:57:53 EST
----- On Oct 26, 2020, at 4:44 PM, rostedt rostedt@xxxxxxxxxxx wrote:
> On Mon, 26 Oct 2020 10:28:07 -0400 (EDT)
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>
>> I agree with Peter. Removing the trace_.*_rcuidle weirdness from the tracepoint
>> API and fixing all callers to ensure they trace from a context where RCU is
>> watching would simplify instrumentation of the Linux kernel, thus making it
>> harder
>> for subtle bugs to hide and be unearthed only when tracing is enabled. This is
>
> Note, the lockdep RCU checking of a tracepoint is outside of it being
> enabled or disable. So if a non rcuidle() tracepoint is in a location that
> RCU is not watching, it will complain loudly, even if you don't enable that
> tracepoint.
>
>> AFAIU the general approach Thomas Gleixner has been aiming for recently, and I
>> think it is a good thing.
>>
>> So if we consider this our target, and that the current state of things is that
>> we need to have RCU watching around callback invocation, then removing the
>> dependency on SRCU seems like an overall simplification which does not regress
>> feature-wise nor speed-wise compared with what we have upstream today. The next
>> steps would then be to audit all rcuidle tracepoints and make sure the context
>> where they are placed has RCU watching already, so we can remove the tracepoint
>
> Just remove the _rcuidle() from them, and lockdep will complain if they are
> being called without RCU watching.
That's the easy part. The patch removing rcuidle is attached to this email,
and here are the splats I get just when booting the machine (see below). The
part which takes more time is fixing those splats and figuring out how to
restructure the code. For instance, what should we do about the rcuidle
tracepoint within printk() ?
Also, how do we test rcuidle tracepoints triggered only when printk is called
from printk warnings ? We'd also need to test on arm32 boards, because some arm
architecture code uses rcuidle tracepoints as well.
As this is beyond the scope of this patch set, I can either drop this patch
entirely (it's not required for sleepable tracepoints), or keep it as an
intermediate cleanup step. Removing rcuidle tracepoints entirely is beyond
the effort Michael and I can undertake now.
=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
=============================
./include/trace/events/preemptirq.h:42 suspicious rcu_dereference_check() usage!
WARNING: suspicious RCU usage
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 1
5.9.1+ #72 Not tainted
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.
-----------------------------
./include/trace/events/preemptirq.h:38 suspicious rcu_dereference_check() usage!
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
RCU used illegally from extended quiescent state!
no locks held by swapper/1/0.
dump_stack+0x8d/0xbb
stack backtrace:
? default_idle+0xa/0x20
trace_hardirqs_on+0xed/0xf0
default_idle+0xa/0x20
default_idle_call+0x4f/0x1e0
do_idle+0x1ef/0x2c0
cpu_startup_entry+0x19/0x20
start_kernel+0x578/0x59d
secondary_startup_64+0xa4/0xb0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.9.1+ #72
=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/trace/events/lock.h:37 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
no locks held by swapper/0/0.
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x8d/0xbb
lock_acquire+0x346/0x3b0
? __lock_acquire+0x2e7/0x1da0
_raw_spin_lock+0x2c/0x40
? vprintk_emit+0x9f/0x410
vprintk_emit+0x9f/0x410
printk+0x52/0x6e
lockdep_rcu_suspicious+0x1b/0xe0
? default_idle+0xa/0x20
trace_hardirqs_on+0xed/0xf0
default_idle+0xa/0x20
default_idle_call+0x4f/0x1e0
do_idle+0x1ef/0x2c0
cpu_startup_entry+0x19/0x20
start_kernel+0x578/0x59d
secondary_startup_64+0xa4/0xb0
=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/trace/events/lock.h:63 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
1 lock held by swapper/0/0:
#0: ffffffff97a80158 (logbuf_lock){-...}-{2:2}, at: vprintk_emit+0x9f/0x410
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x8d/0xbb
lock_release+0x25a/0x280
_raw_spin_unlock+0x17/0x30
vprintk_emit+0xdf/0x410
printk+0x52/0x6e
lockdep_rcu_suspicious+0x1b/0xe0
? default_idle+0xa/0x20
trace_hardirqs_on+0xed/0xf0
default_idle+0xa/0x20
default_idle_call+0x4f/0x1e0
do_idle+0x1ef/0x2c0
cpu_startup_entry+0x19/0x20
start_kernel+0x578/0x59d
secondary_startup_64+0xa4/0xb0
=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/trace/events/printk.h:33 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
2 locks held by swapper/0/0:
#0: ffffffff97a801a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x126/0x410
#1: ffffffff97a7fec0 (console_owner){....}-{0:0}, at: console_unlock+0x190/0x5d0
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x8d/0xbb
console_unlock+0x5ad/0x5d0
vprintk_emit+0x133/0x410
printk+0x52/0x6e
lockdep_rcu_suspicious+0x1b/0xe0
? default_idle+0xa/0x20
trace_hardirqs_on+0xed/0xf0
default_idle+0xa/0x20
default_idle_call+0x4f/0x1e0
do_idle+0x1ef/0x2c0
cpu_startup_entry+0x19/0x20
start_kernel+0x578/0x59d
secondary_startup_64+0xa4/0xb0
=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/linux/rcupdate.h:636 rcu_read_lock() used illegally while idle!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
4 locks held by swapper/0/0:
#0: ffffffff97a801a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x126/0x410
#1: ffffffff97a7fec0 (console_owner){....}-{0:0}, at: console_unlock+0x190/0x5d0
#2: ffffffff97b7d598 (printing_lock){....}-{2:2}, at: vt_console_print+0x7d/0x3e0
#3: ffffffff97a82d80 (rcu_read_lock){....}-{1:2}, at: __atomic_notifier_call_chain+0x5/0x110
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x8d/0xbb
__atomic_notifier_call_chain+0xd7/0x110
vt_console_print+0x19e/0x3e0
console_unlock+0x417/0x5d0
vprintk_emit+0x133/0x410
printk+0x52/0x6e
lockdep_rcu_suspicious+0x1b/0xe0
? default_idle+0xa/0x20
trace_hardirqs_on+0xed/0xf0
default_idle+0xa/0x20
default_idle_call+0x4f/0x1e0
do_idle+0x1ef/0x2c0
cpu_startup_entry+0x19/0x20
start_kernel+0x578/0x59d
secondary_startup_64+0xa4/0xb0
=============================
WARNING: suspicious RCU usage
5.9.1+ #72 Not tainted
-----------------------------
./include/linux/rcupdate.h:685 rcu_read_unlock() used illegally while idle!
other info that might help us debug this:
rcu_scheduler_active = 1, debug_locks = 1
RCU used illegally from extended quiescent state!
4 locks held by swapper/0/0:
#0: ffffffff97a801a0 (console_lock){+.+.}-{0:0}, at: vprintk_emit+0x126/0x410
#1: ffffffff97a7fec0 (console_owner){....}-{0:0}, at: console_unlock+0x190/0x5d0
#2: ffffffff97b7d598 (printing_lock){....}-{2:2}, at: vt_console_print+0x7d/0x3e0
#3: ffffffff97a82d80 (rcu_read_lock){....}-{1:2}, at: __atomic_notifier_call_chain+0x5/0x110
stack backtrace:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.9.1+ #72
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x8d/0xbb
__atomic_notifier_call_chain+0x101/0x110
vt_console_print+0x19e/0x3e0
console_unlock+0x417/0x5d0
vprintk_emit+0x133/0x410
printk+0x52/0x6e
lockdep_rcu_suspicious+0x1b/0xe0
? default_idle+0xa/0x20
trace_hardirqs_on+0xed/0xf0
default_idle+0xa/0x20
default_idle_call+0x4f/0x1e0
do_idle+0x1ef/0x2c0
cpu_startup_entry+0x19/0x20
start_kernel+0x578/0x59d
secondary_startup_64+0xa4/0xb0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
dump_stack+0x8d/0xbb
? rcu_idle_exit+0x32/0x40
trace_hardirqs_off+0xe3/0xf0
rcu_idle_exit+0x32/0x40
default_idle_call+0x54/0x1e0
do_idle+0x1ef/0x2c0
? lockdep_hardirqs_on_prepare+0xec/0x180
cpu_startup_entry+0x19/0x20
start_secondary+0x11c/0x160
secondary_startup_64+0xa4/0xb0
Thanks,
Mathieu
>
> -- Steve
>
>
>> rcuidle API. That would effectively remove the calls to
>> rcu_irq_{enter,exit}_irqson
>> from the tracepoint code.
>>
>> This is however beyond the scope of the proposed patch set.
>>
>> Thanks,
>>
>> Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
Attachment:
0001-wip-remove-rcuidle-tracepoint-API.patch
Description: application/mbox