[tip:timers/urgent] tick/broadcast: Prevent deadlock on tick_broadcast_lock

From: tip-bot for Mike Galbraith
Date: Mon Feb 13 2017 - 03:53:11 EST


Commit-ID: 202461e2f3c15dbfb05825d29ace0d20cdf55fa4
Gitweb: http://git.kernel.org/tip/202461e2f3c15dbfb05825d29ace0d20cdf55fa4
Author: Mike Galbraith <efault@xxxxxx>
AuthorDate: Mon, 13 Feb 2017 03:31:55 +0100
Committer: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
CommitDate: Mon, 13 Feb 2017 09:49:31 +0100

tick/broadcast: Prevent deadlock on tick_broadcast_lock

tick_broadcast_lock is taken from interrupt context, but the following call
chain takes the lock without disabling interrupts:

[ 12.703736] _raw_spin_lock+0x3b/0x50
[ 12.703738] tick_broadcast_control+0x5a/0x1a0
[ 12.703742] intel_idle_cpu_online+0x22/0x100
[ 12.703744] cpuhp_invoke_callback+0x245/0x9d0
[ 12.703752] cpuhp_thread_fun+0x52/0x110
[ 12.703754] smpboot_thread_fn+0x276/0x320

So the following deadlock can happen:

lock(tick_broadcast_lock);
<Interrupt>
lock(tick_broadcast_lock);

intel_idle_cpu_online() is the only place which violates the calling
convention of tick_broadcast_control(). This was caused by the removal of
the smp function call in course of the cpu hotplug rework.

Instead of slapping local_irq_disable/enable() at the call site, we can
relax the calling convention and handle it in the core code, which makes
the whole machinery more robust.

Fixes: 29d7bbada98e ("intel_idle: Remove superfluous SMP fuction call")
Reported-by: Gabriel C <nix.or.die@xxxxxxxxx>
Signed-off-by: Mike Galbraith <efault@xxxxxx>
Cc: Ruslan Ruslichenko <rruslich@xxxxxxxxx>
Cc: Jiri Slaby <jslaby@xxxxxxx>
Cc: Greg KH <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxxxx>
Cc: lwn@xxxxxxx
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Cc: Anna-Maria Gleixner <anna-maria@xxxxxxxxxxxxx>
Cc: Sebastian Siewior <bigeasy@xxxxxxxxxxxxx>
Cc: stable <stable@xxxxxxxxxxxxxxx>
Link: http://lkml.kernel.org/r/1486953115.5912.4.camel@xxxxxx
Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>

---
kernel/time/tick-broadcast.c | 15 +++++++--------
1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
index 3109204..17ac99b 100644
--- a/kernel/time/tick-broadcast.c
+++ b/kernel/time/tick-broadcast.c
@@ -347,17 +347,16 @@ static void tick_handle_periodic_broadcast(struct clock_event_device *dev)
*
* Called when the system enters a state where affected tick devices
* might stop. Note: TICK_BROADCAST_FORCE cannot be undone.
- *
- * Called with interrupts disabled, so clockevents_lock is not
- * required here because the local clock event device cannot go away
- * under us.
*/
void tick_broadcast_control(enum tick_broadcast_mode mode)
{
struct clock_event_device *bc, *dev;
struct tick_device *td;
int cpu, bc_stopped;
+ unsigned long flags;

+ /* Protects also the local clockevent device. */
+ raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
td = this_cpu_ptr(&tick_cpu_device);
dev = td->evtdev;

@@ -365,12 +364,11 @@ void tick_broadcast_control(enum tick_broadcast_mode mode)
* Is the device not affected by the powerstate ?
*/
if (!dev || !(dev->features & CLOCK_EVT_FEAT_C3STOP))
- return;
+ goto out;

if (!tick_device_is_functional(dev))
- return;
+ goto out;

- raw_spin_lock(&tick_broadcast_lock);
cpu = smp_processor_id();
bc = tick_broadcast_device.evtdev;
bc_stopped = cpumask_empty(tick_broadcast_mask);
@@ -420,7 +418,8 @@ void tick_broadcast_control(enum tick_broadcast_mode mode)
tick_broadcast_setup_oneshot(bc);
}
}
- raw_spin_unlock(&tick_broadcast_lock);
+out:
+ raw_spin_unlock_irqrestore(&tick_broadcast_lock, flags);
}
EXPORT_SYMBOL_GPL(tick_broadcast_control);