Re: [PATCH 32/35] clockevents: Fix cpu down race for hrtimer based broadcasting

From: Nicolas Pitre
Date: Thu Feb 19 2015 - 12:51:59 EST

Next message: Anshul Garg: "Re: Fw: [PATCH] lib/kstrtox.c Stop parsing integer on overflow"
Previous message: John Stultz: "Re: [PATCH 0/2] perf/x86: Add ability to sample TSC"
In reply to: Peter Zijlstra: "Re: [PATCH 32/35] clockevents: Fix cpu down race for hrtimer based broadcasting"
Next in thread: Peter Zijlstra: "Re: [PATCH 32/35] clockevents: Fix cpu down race for hrtimer based broadcasting"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

On Mon, 16 Feb 2015, Peter Zijlstra wrote:

> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> Preeti reported a cpu down race with hrtimer based broadcasting:
>
> Assume CPU1 is the CPU which holds the hrtimer broadcasting duty
> before it is taken down.
>
>   CPU0                                  CPU1
>   cpu_down()
>                                         takedown_cpu()
>                                         disable_interrupts()
>   cpu_die()
>   while (CPU1 != DEAD) {
>     msleep(100);
>       switch_to_idle()
>         stop_cpu_timer()
>           schedule_broadcast()
>   }
>
>   tick_cleanup_dead_cpu()
>     take_over_broadcast()
>
> So after CPU1 disabled interrupts it cannot handle the broadcast
> hrtimer anymore, so CPU0 will be stuck forever.
>
> Doing a "while (CPU1 != DEAD) msleep(100);" periodic poll is silly at
> best, but we need to fix that nevertheless.
>
> Split the tick cleanup into two pieces:
>
> 1) Shut down and remove all per cpu clockevent devices from
>    takedown_cpu()
>
>    This is done carefully with respect to existing arch code which
>    works around the shortcoming of the clockevents core code in
>    interesting ways. We really want a separate callback for this to
>    clean up the workarounds, but that's not in the scope of this patch
>
> 2) Take over the broadcast duty explicitly before calling cpu_die()
>
>    This is a temporary workaround as well. What we really want is a
>    callback in the clockevent device which allows us to do that from
>    the dying CPU by pushing the hrtimer onto a different cpu. That
>    might involve an IPI and is definitely more complex than this
>    immediate fix.
>
> Reported-by: Preeti U Murthy <preeti@xxxxxxxxxxxxxxxxxx>
> Signed-off-by: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
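
The scenario quoted above can be modelled in a few lines of userspace C.
This is only a sketch: struct race_model and the functions below are
hypothetical names, not kernel interfaces, and the model assumes only what
the changelog states, namely that the broadcast hrtimer can fire only while
the CPU holding the duty still has interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical two-CPU model; none of these names exist in the kernel. */
struct race_model {
	int  broadcast_owner;	/* CPU holding the hrtimer broadcast duty */
	bool irqs_enabled[2];	/* per-CPU interrupt state */
};

/* Initial state from the changelog: CPU1 owns the duty, both CPUs are live. */
static struct race_model race_model_init(void)
{
	struct race_model m = {
		.broadcast_owner = 1,
		.irqs_enabled = { true, true },
	};
	return m;
}

/* CPU1 disables interrupts on its way down. */
static void cpu1_disable_interrupts(struct race_model *m)
{
	m->irqs_enabled[1] = false;
}

/* Step 2 of the changelog: another CPU takes the broadcast duty over. */
static void take_over_broadcast_model(struct race_model *m, int new_owner)
{
	assert(new_owner == 0 || new_owner == 1);
	m->broadcast_owner = new_owner;
}

/* CPU0's msleep() can only end if the duty owner still takes interrupts. */
static bool broadcast_wakeup_delivered(const struct race_model *m)
{
	return m->irqs_enabled[m->broadcast_owner];
}
```

In this model, calling cpu1_disable_interrupts() and then
broadcast_wakeup_delivered() yields false (CPU0 stays asleep), while
calling take_over_broadcast_model(&m, 0) first makes it true again.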

This breaks the b.L switcher disabling code which essentially does:

static void bL_switcher_restore_cpus(void)
{
	int i;

	for_each_cpu(i, &bL_switcher_removed_logical_cpus) {
		struct device *cpu_dev = get_cpu_device(i);
		int ret = device_online(cpu_dev);
		if (ret)
			dev_err(cpu_dev, "switcher: unable to restore CPU\n");
	}
}

However, as soon as one new CPU becomes online, the following crash
occurs on that CPU:

[ 547.858031] ------------[ cut here ]------------
[ 547.871868] kernel BUG at kernel/time/hrtimer.c:1249!
[ 547.886991] Internal error: Oops - BUG: 0 [#1] SMP THUMB2
[ 547.903155] Modules linked in:
[ 547.912303] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 3.19.0-rc5-00058-gdd7a65fbc5 #527
[...]
[ 548.599060] [<c005a1c2>] (hrtimer_interrupt) from [<c00639db>] (tick_do_broadcast.constprop.8+0x8f/0x90)
[ 548.627482] [<c00639db>] (tick_do_broadcast.constprop.8) from [<c0063acd>] (tick_handle_oneshot_broadcast+0xf1/0x168)
[ 548.659290] [<c0063acd>] (tick_handle_oneshot_broadcast) from [<c001a07f>] (sp804_timer_interrupt+0x2b/0x30)
[ 548.688755] [<c001a07f>] (sp804_timer_interrupt) from [<c004ee9b>] (handle_irq_event_percpu+0x37/0x130)
[ 548.716916] [<c004ee9b>] (handle_irq_event_percpu) from [<c004efc7>] (handle_irq_event+0x33/0x48)
[ 548.743511] [<c004efc7>] (handle_irq_event) from [<c0050c1d>] (handle_fasteoi_irq+0x69/0xe4)
[ 548.768804] [<c0050c1d>] (handle_fasteoi_irq) from [<c004e835>] (generic_handle_irq+0x1d/0x28)
[ 548.794619] [<c004e835>] (generic_handle_irq) from [<c004ea17>] (__handle_domain_irq+0x3f/0x80)
[ 548.820694] [<c004ea17>] (__handle_domain_irq) from [<c00084f5>] (gic_handle_irq+0x21/0x4c)
[ 548.845729] [<c00084f5>] (gic_handle_irq) from [<c04521db>] (__irq_svc+0x3b/0x5c)

The corresponding code is:

void hrtimer_interrupt(struct clock_event_device *dev)
{
	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
	ktime_t expires_next, now, entry_time, delta;
	int i, retries = 0;

	BUG_ON(!cpu_base->hres_active);
	[...]

Reverting this patch "fixes" the problem.
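
For illustration, here is a minimal userspace sketch of the check that
fires. It assumes, as an inference from the trace rather than something
established above, that the broadcast path called hrtimer_interrupt() on a
CPU whose hrtimer base had not (yet) been switched to high-resolution mode.
All model_* names are hypothetical and only mirror the single field the
BUG_ON() looks at.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for the one hrtimer_cpu_base field the check uses. */
struct model_cpu_base {
	bool hres_active;
};

static struct model_cpu_base model_bases[MODEL_NR_CPUS];

/* Bring a CPU "online"; hres says whether it has entered high-res mode. */
static void model_cpu_online(int cpu, bool hres)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_bases[cpu].hres_active = hres;
}

/*
 * Mirrors BUG_ON(!cpu_base->hres_active): returns -1 where the kernel
 * would BUG, 0 where the interrupt would be handled normally.
 */
static int model_hrtimer_interrupt(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return model_bases[cpu].hres_active ? 0 : -1;
}
```

With model_cpu_online(2, false), model_hrtimer_interrupt(2) returns -1,
which corresponds to the BUG seen on CPU 2 in the trace.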

Nicolas