Re: call_function_many: fix list delete vs add race

From: Milton Miller
Date: Mon Jan 31 2011 - 15:26:15 EST


On Mon, 31 Jan 2011 about 11:27:45 +0100, Peter Zijlstra wrote:
> On Fri, 2011-01-28 at 18:20 -0600, Milton Miller wrote:
> > Peter pointed out there was nothing preventing the list_del_rcu in
> > smp_call_function_interrupt from running before the list_add_rcu in
> > smp_call_function_many. Fix this by not setting refs until we have put
> > the entry on the list. We can use the lock acquire and release instead
> > of a wmb.
> >
> > Reported-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> > Signed-off-by: Milton Miller <miltonm@xxxxxxx>
> > Cc: "Paul E. McKenney" <paulmck@xxxxxxxxxxxxxxxxxx>
> > ---
> >
> > I tried to force this race with a udelay before the lock & list_add and
> > by mixing all 64 online cpus with just 3 random cpus in the mask, but
> > was unsuccessful. Still, it seems to be a valid race, and the fix
> > is a simple change to the current code.
>
> Yes, I think this will fix it. I think simply putting that assignment
> under the lock is sufficient, because then the list removal will
> serialize against the list add. But placing it after the list add does
> also seem sufficient.
>
> Acked-by: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
>

I was worried that some architectures would allow a write issued before
the spin_lock to sink into the locked region, in which case the data or
function pointer could still be seen stale with the cpu mask bit set.
The unlock must order all prior writes ahead of it, and therefore the
new function and data will be seen before refs is set.
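
For anyone following along, here is a rough user-space sketch of the
ordering I am relying on. It is not the kernel code: struct call_entry,
publish(), try_run() and bump() are made-up names, a pthread mutex stands
in for the spinlock, and a plain pointer list stands in for the RCU list.
I also use an explicit C11 release store for refs so the sketch is valid
under the C11 memory model. That is an assumption of the sketch; in the
kernel patch the ordering comes from the unlock itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's per-call data. */
struct call_entry {
    void (*func)(void *info);
    void *info;
    atomic_int refs;              /* 0 means "not published yet" */
    struct call_entry *next;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct call_entry *queue_head;

/* Sample callback used only for illustration. */
static void bump(void *info)
{
    ++*(int *)info;
}

/* Sender: fill in the payload, put the entry on the list, then set refs. */
static void publish(struct call_entry *e, void (*func)(void *),
                    void *info, int nr_targets)
{
    e->func = func;               /* payload writes come first */
    e->info = info;

    pthread_mutex_lock(&queue_lock);
    e->next = queue_head;         /* entry goes on the list under the lock */
    queue_head = e;
    pthread_mutex_unlock(&queue_lock);

    /* refs is set last, so a receiver that sees it non-zero also sees
     * the payload and an entry that is already on the list. */
    atomic_store_explicit(&e->refs, nr_targets, memory_order_release);
}

/* Receiver: returns 1 if the callback ran, 0 if the entry was skipped. */
static int try_run(struct call_entry *e)
{
    if (atomic_load_explicit(&e->refs, memory_order_acquire) == 0)
        return 0;                 /* not published, or already finished */

    assert(e->func != NULL);      /* payload must be visible by now */
    e->func(e->info);
    atomic_fetch_sub_explicit(&e->refs, 1, memory_order_acq_rel);
    return 1;
}
```

A receiver that looks at an entry before publish() has finished sees
refs == 0 and leaves it alone, so it can never drop the last reference
and delete the entry before the add has happened.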

milton