Re: [PATCH V2] clockevents: Fix cpu down race for hrtimer based broadcasting

From: Ingo Molnar
Date: Thu Apr 02 2015 - 06:42:38 EST



* Preeti U Murthy <preeti@xxxxxxxxxxxxxxxxxx> wrote:

> It was found when doing a hotplug stress test on POWER, that the machine
> either hit softlockups or rcu_sched stall warnings. The issue was
> traced to commit 7cba160ad789a powernv/cpuidle: Redesign idle states
> management, which exposed the cpu down race with hrtimer based broadcast
> mode(Commit 5d1638acb9f6(tick: Introduce hrtimer based broadcast). This
> is explained below.
>
> Assume CPU1 is the CPU which holds the hrtimer broadcasting duty before
> it is taken down.
>
> CPU0 CPU1
>
> cpu_down() take_cpu_down()
> disable_interrupts()
>
> cpu_die()
>
> while(CPU1 != CPU_DEAD) {
> msleep(100);
> switch_to_idle();
> stop_cpu_timer();
> schedule_broadcast();
> }
>
> tick_cleanup_cpu_dead()
> take_over_broadcast()
>
> So after CPU1 disabled interrupts it cannot handle the broadcast hrtimer
> anymore, so CPU0 will be stuck forever.
>
> Fix this by explicitly taking over broadcast duty before cpu_die().
> This is a temporary workaround. What we really want is a callback in the
> clockevent device which allows us to do that from the dying CPU by
> pushing the hrtimer onto a different cpu. That might involve an IPI and
> is definitely more complex than this immediate fix.

So why not use a suitable CPU_DOWN* notifier for this, instead of open
coding it all into a random place in the hotplug machinery?

Also, I improved the changelog (attached below), but decided against
applying it until these questions are cleared - please use that for
future versions of this patch.

Thanks,

Ingo

===================>