Re: call_function_many: fix list delete vs add race
From: Peter Zijlstra
Date: Mon Jan 31 2011 - 05:27:01 EST
On Fri, 2011-01-28 at 18:20 -0600, Milton Miller wrote:
> Peter pointed out there was nothing preventing the list_del_rcu in
> smp_call_function_interrupt from running before the list_add_rcu in
> smp_call_function_many. Fix this by not setting refs until we have put
> the entry on the list. We can use the lock acquire and release instead
> of a wmb.
> Reported-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> Signed-off-by: Milton Miller <miltonm@xxxxxxx>
> Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> I tried to force this race with a udelay before the lock & list_add and
> by mixing all 64 online cpus with just 3 random cpus in the mask, but
> was unsuccessful. Still, it seems to be a valid race, and the fix
> is a simple change to the current code.
Yes, I think this will fix it, I think simply putting that assignment
under the lock is sufficient, because then the list removal will
serialize again the list add. But placing it after the list add does
also seem sufficient.
Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Index: common/kernel/smp.c
> --- common.orig/kernel/smp.c 2011-01-28 16:23:15.000000000 -0600
> +++ common/kernel/smp.c 2011-01-28 16:55:01.000000000 -0600
> @@ -491,15 +491,17 @@ void smp_call_function_many(const struct
> cpumask_clear_cpu(this_cpu, data->cpumask);
> - * To ensure the interrupt handler gets an complete view
> - * we order the cpumask and refs writes and order the read
> - * of them in the interrupt handler. In addition we may
> - * only clear our own cpu bit from the mask.
> + * We reuse the call function data without waiting for any grace
> + * period after some other cpu removes it from the global queue.
> + * This means a cpu might find our data block as it is writen.
> + * The interrupt handler waits until it sees refs filled out
> + * while its cpu mask bit is set; here we may only clear our
> + * own cpu mask bit, and must wait to set refs until we are sure
> + * previous writes are complete and we have obtained the lock to
> + * add the element to the queue. We use the acquire and release
> + * of the lock as a wmb() -- acquire prevents write moving up and
> + * release requires old writes are visible.
> - smp_wmb();
> - atomic_set(&data->refs, cpumask_weight(data->cpumask));
> raw_spin_lock_irqsave(&call_function.lock, flags);
> * Place entry at the _HEAD_ of the list, so that any cpu still
> @@ -509,6 +511,8 @@ void smp_call_function_many(const struct
> list_add_rcu(&data->csd.list, &call_function.queue);
> raw_spin_unlock_irqrestore(&call_function.lock, flags);
> + atomic_set(&data->refs, cpumask_weight(data->cpumask));
> * Make the list addition visible before sending the ipi.
> * (IPIs must obey or appear to obey normal Linux cache
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/