Re: [PATCH v4 2/2] CPU hotplug, stop-machine: Plug race-window that leads to "IPI-to-offline-CPU"

From: Frederic Weisbecker
Date: Tue May 13 2014 - 11:57:37 EST


On Tue, May 13, 2014 at 02:32:00PM +0530, Srivatsa S. Bhat wrote:
>
> kernel/stop_machine.c | 39 ++++++++++++++++++++++++++++++++++-----
> 1 file changed, 34 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
> index 01fbae5..288f7fe 100644
> --- a/kernel/stop_machine.c
> +++ b/kernel/stop_machine.c
> @@ -130,8 +130,10 @@ enum multi_stop_state {
> MULTI_STOP_NONE,
> /* Awaiting everyone to be scheduled. */
> MULTI_STOP_PREPARE,
> - /* Disable interrupts. */
> - MULTI_STOP_DISABLE_IRQ,
> + /* Disable interrupts on CPUs not in ->active_cpus mask. */
> + MULTI_STOP_DISABLE_IRQ_INACTIVE,
> + /* Disable interrupts on CPUs in ->active_cpus mask. */
> + MULTI_STOP_DISABLE_IRQ_ACTIVE,
> /* Run the function */
> MULTI_STOP_RUN,
> /* Exit */
> @@ -189,12 +191,39 @@ static int multi_cpu_stop(void *data)
> do {
> /* Chill out and ensure we re-read multi_stop_state. */
> cpu_relax();
> +
> + /*
> + * We use 2 separate stages to disable interrupts, namely
> + * _INACTIVE and _ACTIVE, to ensure that the inactive CPUs
> + * disable their interrupts first, followed by the active CPUs.
> + *
> + * This is done to avoid a race in the CPU offline path, which
> + * can lead to receiving IPIs on the outgoing CPU *after* it
> + * has gone offline.
> + *
> + * During CPU offline, we don't want the other CPUs to send
> + * IPIs to the active_cpu (the outgoing CPU) *after* it has
> + * disabled interrupts (because, then it will notice the IPIs
> + * only after it has gone offline). We can prevent this by
> + * making the other CPUs disable their interrupts first - that
> + * way, they will run the stop-machine code with interrupts
> + * disabled, and hence won't send IPIs after that point.
> + */
> +
> if (msdata->state != curstate) {
> curstate = msdata->state;
> switch (curstate) {
> - case MULTI_STOP_DISABLE_IRQ:
> - local_irq_disable();
> - hard_irq_disable();
> + case MULTI_STOP_DISABLE_IRQ_INACTIVE:
> + if (!is_active) {
> + local_irq_disable();
> + hard_irq_disable();
> + }
> + break;
> + case MULTI_STOP_DISABLE_IRQ_ACTIVE:
> + if (is_active) {
> + local_irq_disable();
> + hard_irq_disable();

I have no idea about possible IPI latencies due to hardware. But are we sure that a stop
machine transition state is enough to make sure we get a pending IPI? Shouldn't we have
some sort of IPI flush in between, like polling on call_single_queue?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/