[PATCH v2] softirq: Call trace_preempt_on manually to prevent lockdep splatter

From: Daniel Wagner
Date: Thu Nov 26 2015 - 09:00:02 EST


__do_softirq() uses __local_bh_disable_ip(SOFTIRQ_OFFSET) to disable
SOFTIRQs and signaling that softirqs are served.

When done, __do_softirq() calls __local_bh_enable(SOFTIRQ_OFFSET)
to enable them again. The reason it does not call
__local_bh_enable_ip(SOFTIRQ_OFFSET) is because __local_bh_enable_ip()
would process still-pending softirqs.

Now, we face the same problem as in __local_bh_disable_ip()
with the preempt tracer hooking into preempt_count_sub(). This
will case lockdep warning, when for example the kernel is configured
to run all IRQs as threads and preemptoff tracer is activated:

[ 42.209684] WARNING: CPU: 2 PID: 16 at kernel/locking/lockdep.c:3574 check_flags.part.36+0x1bc/0x210()
[ 42.210053] DEBUG_LOCKS_WARN_ON(current->softirqs_enabled)
[ 42.210053] Kernel panic - not syncing: panic_on_warn set ...
[ 42.210053]
[ 42.210053] CPU: 2 PID: 16 Comm: ksoftirqd/2 Not tainted 4.1.0-rc4 #666
[ 42.210053] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140709_153950- 04/01/2014
[ 42.210053] ffff88007ca47ab8 ffff88007ca47ab8 ffffffff81b5a06a 0000000000000007
[ 42.210053] ffffffff81ef08b8 ffff88007ca47b38 ffffffff81b567ee 0000000000000102
[ 42.210053] ffffffff00000008 ffff88007ca47b48 ffff88007ca47ae8 ffffffff81b56733
[ 42.210053] Call Trace:
[ 42.210053] [<ffffffff81b5a06a>] dump_stack+0x4f/0x7b
[ 42.210053] [<ffffffff81b567ee>] panic+0xc0/0x1e9
[ 42.210053] [<ffffffff81b56733>] ? panic+0x5/0x1e9
[ 42.210053] [<ffffffff81b66608>] ? _raw_spin_unlock_irqrestore+0x38/0x80
[ 42.210053] [<ffffffff81057fe0>] warn_slowpath_common+0xc0/0xc0
[ 42.210053] [<ffffffff81058026>] warn_slowpath_fmt+0x46/0x50
[ 42.210053] [<ffffffff81057fe5>] ? warn_slowpath_fmt+0x5/0x50
[ 42.210053] [<ffffffff810ac90c>] check_flags.part.36+0x1bc/0x210
[ 42.210053] [<ffffffff810ad5a9>] lock_acquire+0x119/0x2b0
[ 42.210053] [<ffffffff81b663ae>] ? _raw_spin_lock_irqsave+0x1e/0x90
[ 42.210053] [<ffffffff810886a5>] ? preempt_count_add+0x5/0xc0
[ 42.210053] [<ffffffff81b663dc>] _raw_spin_lock_irqsave+0x4c/0x90
[ 42.210053] [<ffffffff8113f1b6>] ? check_critical_timing+0xa6/0x160
[ 42.210053] [<ffffffff81b66395>] ? _raw_spin_lock_irqsave+0x5/0x90
[ 42.210053] [<ffffffff8113f1b6>] check_critical_timing+0xa6/0x160
[ 42.210053] [<ffffffff8105e40b>] ? __do_softirq+0x3fb/0x670
[ 42.210053] [<ffffffff8105c7f6>] ? __local_bh_enable+0x36/0x70
[ 42.210053] [<ffffffff8105e40b>] ? __do_softirq+0x3fb/0x670
[ 42.210053] [<ffffffff8105c7f6>] ? __local_bh_enable+0x36/0x70
[ 42.210053] [<ffffffff8113fde1>] trace_preempt_on+0xf1/0x110
[ 42.210053] [<ffffffff81088765>] ? preempt_count_sub+0x5/0xf0
[ 42.210053] [<ffffffff8108880b>] preempt_count_sub+0xab/0xf0
[ 42.210053] [<ffffffff8105c7f6>] ? __local_bh_enable+0x36/0x70
[ 42.210053] [<ffffffff8105c7f6>] __local_bh_enable+0x36/0x70
[ 42.210053] [<ffffffff8105e40b>] ? __do_softirq+0x3fb/0x670
[ 42.210053] [<ffffffff8105e40b>] __do_softirq+0x3fb/0x670
[ 42.210053] [<ffffffff8105e69f>] run_ksoftirqd+0x1f/0x60
[ 42.210053] [<ffffffff8107f8f3>] smpboot_thread_fn+0x193/0x2a0
[ 42.210053] [<ffffffff8107f760>] ? sort_range+0x30/0x30
[ 42.210053] [<ffffffff8107bac2>] kthread+0xf2/0x110
[ 42.210053] [<ffffffff81b61ac3>] ? wait_for_completion+0xc3/0x120
[ 42.210053] [<ffffffff8108880b>] ? preempt_count_sub+0xab/0xf0
[ 42.210053] [<ffffffff8107b9d0>] ? kthread_create_on_node+0x240/0x240
[ 42.210053] [<ffffffff81b674c2>] ret_from_fork+0x42/0x70
[ 42.210053] [<ffffffff8107b9d0>] ? kthread_create_on_node+0x240/0x240
[ 42.210053] Kernel Offset: disabled
[ 42.210053] ---[ end Kernel panic - not syncing: panic_on_warn set ...

To mitigate this, call the preempt_tracer manually and then
update the preempt_count.

While at we are at it also update the comment on _local_bh_enable.
cond_resched_softirq() used to call _local_bh_enable(), now it uses
local_bh_enable(). This was changed by:

commit 98d8256739f2c6c636fa2da359f5949c739ae839
Author: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Date: Wed May 23 13:58:18 2007 -0700

Prevent going idle with softirq pending

Signed-off-by: Daniel Wagner <daniel.wagner@xxxxxxxxxxxx>
Acked-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Steven Rostedt <rostedt@xxxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
---
Hi,

This patch has not found its way into the kernel yet. Hence a rebased
to mainline and resend.

cheers
daniel

kernel/softirq.c | 22 ++++++++++++++++++----
1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 479e443..9ec59f2 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -124,19 +124,33 @@ void __local_bh_disable_ip(unsigned long ip, unsigned int cnt)
EXPORT_SYMBOL(__local_bh_disable_ip);
#endif /* CONFIG_TRACE_IRQFLAGS */

+/*
+ * Special-case - softirqs can safely be enabled in
+ * __do_softirq(), without processing still-pending softirqs.
+ */
static void __local_bh_enable(unsigned int cnt)
{
WARN_ON_ONCE(!irqs_disabled());

+ /*
+ * The preempt tracer hooks into preempt_count_sub and will break
+ * lockdep because it calls back into lockdep before SOFTIRQ_OFFSET
+ * is cleared while current->softirq_enabled is set.
+ * We must call the trace_preempt_off now, and decrement preempt_count
+ * afterwards manually.
+ */
+ if (preempt_count() == cnt)
+ trace_preempt_on(CALLER_ADDR0, get_parent_ip(CALLER_ADDR1));
+
if (softirq_count() == (cnt & SOFTIRQ_MASK))
trace_softirqs_on(_RET_IP_);
- preempt_count_sub(cnt);
+
+ __preempt_count_sub(cnt);
}

/*
- * Special-case - softirqs can safely be enabled in
- * cond_resched_softirq(), or by __do_softirq(),
- * without processing still-pending softirqs:
+ * Special-case - enable softirqs without processing
+ * still pending softirqs from IRQ context.
*/
void _local_bh_enable(void)
{
--
2.4.3

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/