[PATCH RT 3/5] lockdep: Correctly annotate hardirq context in irq_exit()

From: Steven Rostedt
Date: Tue Dec 10 2013 - 19:52:16 EST


3.4.72-rt90-rc1 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Peter Zijlstra <peterz@xxxxxxxxxxxxx>

There was a reported deadlock on -rt which lockdep didn't report.

It turns out that in irq_exit() we tell lockdep that the hardirq
context ends and then do all kinds of locking afterwards.

To fix it, move trace_hardirq_exit() to the very end of irq_exit(), this
ensures all locking in tick_irq_exit() and rcu_irq_exit() are properly
recorded as happening from hardirq context.

This however leads to the 'fun' little problem of running softirqs
while in hardirq context. To cure this make the softirq code a little
more complex (in the CONFIG_TRACE_IRQFLAGS case).

Due to stack swizzling arch dependent trickery we cannot pass an
argument to __do_softirq() to tell it if it was done from hardirq
context or not; so use a side-band argument.

When we do __do_softirq() from hardirq context, 'atomically' flip to
softirq context and back, so that no locking goes without being in
either hard- or soft-irq context.

I didn't find any new problems in mainline using this patch, but it
did show the -rt problem.

Cc: stable-rt@xxxxxxxxxxxxxxx
Reported-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
Link: http://lkml.kernel.org/n/tip-dgwc5cdksbn0jk09vbmcc9sa@xxxxxxxxxxxxxx
Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@xxxxxxxxxxxxx>
---
kernel/softirq.c | 45 ++++++++++++++++++++++++++++++++++++++++++---
1 file changed, 42 insertions(+), 3 deletions(-)

diff --git a/kernel/softirq.c b/kernel/softirq.c
index 34fe1db..c2c6bc8 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -297,6 +297,44 @@ EXPORT_SYMBOL(local_bh_enable_ip);
*/
#define MAX_SOFTIRQ_RESTART 10

+#ifdef CONFIG_TRACE_IRQFLAGS
+/*
+ * Convoluted means of passing __do_softirq() a message through the various
+ * architecture execute_on_stack() bits.
+ *
+ * When we run softirqs from irq_exit() and thus on the hardirq stack we need
+ * to keep the lockdep irq context tracking as tight as possible in order to
+ * not miss-qualify lock contexts and miss possible deadlocks.
+ */
+static DEFINE_PER_CPU(int, softirq_from_hardirq);
+
+static inline void lockdep_softirq_from_hardirq(void)
+{
+ this_cpu_write(softirq_from_hardirq, 1);
+}
+
+static inline void lockdep_softirq_start(void)
+{
+ if (this_cpu_read(softirq_from_hardirq))
+ trace_hardirq_exit();
+ lockdep_softirq_enter();
+}
+
+static inline void lockdep_softirq_end(void)
+{
+ lockdep_softirq_exit();
+ if (this_cpu_read(softirq_from_hardirq)) {
+ this_cpu_write(softirq_from_hardirq, 0);
+ trace_hardirq_enter();
+ }
+}
+
+#else
+static inline void lockdep_softirq_from_hardirq(void) { }
+static inline void lockdep_softirq_start(void) { }
+static inline void lockdep_softirq_end(void) { }
+#endif
+
asmlinkage void __do_softirq(void)
{
__u32 pending;
@@ -308,7 +346,7 @@ asmlinkage void __do_softirq(void)

__local_bh_disable((unsigned long)__builtin_return_address(0),
SOFTIRQ_OFFSET);
- lockdep_softirq_enter();
+ lockdep_softirq_start();

cpu = smp_processor_id();
restart:
@@ -324,7 +362,7 @@ restart:
if (pending)
wakeup_softirqd();

- lockdep_softirq_exit();
+ lockdep_softirq_end();

account_system_vtime(current);
__local_bh_enable(SOFTIRQ_OFFSET);
@@ -582,6 +620,7 @@ static inline void invoke_softirq(void)
{
#ifndef CONFIG_PREEMPT_RT_FULL
if (!force_irqthreads) {
+ lockdep_softirq_from_hardirq();
#ifdef __ARCH_IRQ_EXIT_IRQS_DISABLED
__do_softirq();
#else
@@ -604,7 +643,6 @@ static inline void invoke_softirq(void)
void irq_exit(void)
{
account_system_vtime(current);
- trace_hardirq_exit();
sub_preempt_count(IRQ_EXIT_OFFSET);
if (!in_interrupt() && local_softirq_pending())
invoke_softirq();
@@ -615,6 +653,7 @@ void irq_exit(void)
tick_nohz_irq_exit();
#endif
rcu_irq_exit();
+ trace_hardirq_exit(); /* must be last! */
sched_preempt_enable_no_resched();
}

--
1.8.4.3


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/