Re: [PATCH] tick, broadcast: Prevent false alarm when force mask contains offline cpus

From: Srivatsa S. Bhat
Date: Wed Mar 26 2014 - 07:26:38 EST


On 03/26/2014 09:26 AM, Preeti U Murthy wrote:
> Its possible that the tick_broadcast_force_mask contains cpus which are not
> in cpu_online_mask when a broadcast tick occurs. This could happen under the
> following circumstance assuming CPU1 is among the CPUs waiting for broadcast.
>
> CPU0 CPU1
>
> Run CPU_DOWN_PREPARE notifiers
>
> Start stop_machine Gets woken up by IPI to run
> stop_machine, sets itself in
> tick_broadcast_force_mask if the
> time of broadcast interrupt is around
> the same time as this IPI.
>
> Start stop_machine
> set_cpu_online(cpu1, false)
> End stop_machine End stop_machine
>
> Broadcast interrupt
> Finds that cpu1 in
> tick_broadcast_force_mask is offline
> and triggers the WARN_ON in
> tick_handle_oneshot_broadcast()
>
> Clears all broadcast masks
> in CPU_DEAD stage.
>
> This WARN_ON was added to capture scenarios where the broadcast mask, be it
> oneshot/pending/force_mask contain offline cpus whose tick devices have been
> removed. But here is a case where we trigger the warn on in a valid scenario.
>
> One could argue that the scenario is invalid and ought to be warned against
> because ideally the broadcast masks need to be cleared of the cpus about to
> go offine before clearing them in the online_mask so that we dont hit these
> scenarios.
>
> This would mean clearing the masks in CPU_DOWN_PREPARE stage.

Not necessarily. We could clear the mask in the CPU_DYING stage. That way,
offline CPUs will automatically get cleared from the force_mask and hence
the tick-broadcast code will not need to have a special case to deal with
this scenario. What do you think?

Regards,
Srivatsa S. Bhat

> ---
>
> kernel/time/tick-broadcast.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index 63c7b2d..30b8731 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -606,7 +606,12 @@ again:
> */
> cpumask_clear_cpu(smp_processor_id(), tick_broadcast_pending_mask);
>
> - /* Take care of enforced broadcast requests */
> + /* Take care of enforced broadcast requests. We could have offline
> + * cpus in the tick_broadcast_force_mask. Thats ok, we got the interrupt
> + * before we could clear the mask.
> + */
> + cpumask_and(tick_broadcast_force_mask,
> + tick_broadcast_force_mask, cpu_online_mask);
> cpumask_or(tmpmask, tmpmask, tick_broadcast_force_mask);
> cpumask_clear(tick_broadcast_force_mask);
>
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/