[RFC PATCH] hrtimer: remove deadlock due to waiting on IPI in softirq context

From: Rik van Riel
Date: Wed Mar 05 2014 - 16:27:16 EST


There appears to be a deadlock in the hrtimer code. Specifically,
clock_was_set() sends an IPI with wait=1 from softirq context.

Waiting for IPIs to complete in irq context can lead to a deadlock,
because the code that was interrupted might be holding a lock that
another CPU is waiting for with spin_lock_irq() or similar. That CPU
spins with interrupts disabled, so it cannot handle the IPI.

In other words, the current CPU may need to release a resource before
the IPI can be handled by one of the destination CPUs.
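
To make the ordering concrete, here is a toy userspace model of that
scenario. It is only a sketch and not kernel code: struct model, step()
and simulate() are hypothetical names invented for this illustration,
and the model assumes that the interrupted code on CPU0 really does hold
the lock that CPU1 is spinning on with interrupts disabled.

```c
#include <stdbool.h>

/*
 * Toy model, not kernel code. All names here are hypothetical.
 *
 * CPU0 holds a lock, gets interrupted, and its softirq sends an IPI
 * to CPU1. CPU1 is spinning for that lock with interrupts disabled.
 */
struct model {
	bool lock_held_by_cpu0;	/* CPU0 took the lock before the softirq ran */
	bool cpu0_in_softirq;	/* CPU0 is still inside clock_was_set() */
	bool cpu1_irqs_off;	/* CPU1 spins in spin_lock_irq() */
	bool ipi_pending;	/* IPI sent to CPU1, not yet handled */
	bool ipi_handled;	/* CPU1 ran the IPI handler */
};

/* Advance the model by one step; return true if any state changed. */
static bool step(struct model *m, bool wait)
{
	/* CPU1 can only run the IPI handler once its interrupts are on. */
	if (m->ipi_pending && !m->cpu1_irqs_off) {
		m->ipi_pending = false;
		m->ipi_handled = true;
		return true;
	}
	/* CPU0 leaves the softirq: at once if !wait, else after the IPI ran. */
	if (m->cpu0_in_softirq && (!wait || m->ipi_handled)) {
		m->cpu0_in_softirq = false;
		return true;
	}
	/* Back in the interrupted code, CPU0 drops the lock. */
	if (m->lock_held_by_cpu0 && !m->cpu0_in_softirq) {
		m->lock_held_by_cpu0 = false;
		return true;
	}
	/* CPU1 gets the lock, finishes, and re-enables interrupts. */
	if (m->cpu1_irqs_off && !m->lock_held_by_cpu0) {
		m->cpu1_irqs_off = false;
		return true;
	}
	return false;
}

/* Return true if the IPI is eventually handled, false if the model is stuck. */
bool simulate(bool wait)
{
	struct model m = {
		.lock_held_by_cpu0 = true,
		.cpu0_in_softirq = true,
		.cpu1_irqs_off = true,
		.ipi_pending = true,
		.ipi_handled = false,
	};

	while (step(&m, wait))
		;
	return m.ipi_handled;
}
```

In this model simulate(true) gets stuck on the first step, since neither
CPU can make progress, while simulate(false) lets CPU0 return, drop the
lock, and only then does CPU1 handle the IPI.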

To my untrained eye, it does not look like this patch introduces a
new bug to the timer code, but that is hard to ascertain, so I am
posting this as an RFC for the timer gods to hurt their brains on :)

This bug was introduced by 54cdfdb4 in early 2007 (the original
hrtimer code patch).

Not-yet-signed-off-by: Rik van Riel <riel@xxxxxxxxxx>
Reported-by: Mateusz Guzik <mguzik@xxxxxxxxxx>
Cc: Benjamin Herrenschmidt <benh@xxxxxxxxxxxxxxxxxxx>
Cc: Ingo Molnar <mingo@xxxxxxxxxx>
Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
Cc: Prarit Bhargava <prarit@xxxxxxxxxx>
Cc: Frederic Weisbecker <fweisbec@xxxxxxxxx>
Cc: Clark Williams <williams@xxxxxxxxxx>
---
kernel/hrtimer.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 0909436..19145ec 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -771,7 +771,7 @@ void clock_was_set(void)
 {
 #ifdef CONFIG_HIGH_RES_TIMERS
 	/* Retrigger the CPU local events everywhere */
-	on_each_cpu(retrigger_next_event, NULL, 1);
+	on_each_cpu(retrigger_next_event, NULL, 0);
 #endif
 	timerfd_clock_was_set();
 }