Re: call_function_many: fix list delete vs add race

From: Linus Torvalds
Date: Mon Jan 31 2011 - 20:39:55 EST

On Tue, Feb 1, 2011 at 10:22 AM, Benjamin Herrenschmidt
<benh@xxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, 2011-01-31 at 22:17 +0100, Peter Zijlstra wrote:
>> That's wrong:
>>  ->foo =
>>  LOCK
>>  ->bar =
>> can be re-ordered as:
>>  LOCK
>>  ->bar =
>>  ->foo =
> Can it ? I though UNLOCK had a write barrier semantic ?

The re-ordering of ->foo and UNLOCK that Peter claims is definitely
possible, and the unlock is not guaranteed to be a write barrier. The
unlock just guarantees "release" consistency, which means that no
previous reads or writes will be migrated down from within the locked
region, but it is very much a one-way barrier, so a subsequent write -
or more commonly a read - could easily migrate up past the unlock.

(Same for lock, except the permeability is obviously the other way
around - "acquire" consistency)

> So yes, ->bar = can leak into the lock, as can ->foo =, but they can't
> be re-ordered vs. each other because the implied barrier will keep ->foo
> = in the same "domain" as the unlock itself.
> Or do other archs do something really nasty here that don't provide this
> guarantee ?

I think we actually allow the accesses to ->bar and ->foo to be
re-ordered wrt each other, exactly because they can *both* get
re-ordered first into the locked region, and then after that they can
get re-ordered wrt each other (because there is no other memory

So a "unlock+lock" is guaranteed to be equivalent to a full memory
barrier (because an operation before the unlock cannot pass the
unlock, and an access after the lock cannot percolate up before it).
But the regular "lock+unlock" sequence is not, exactly because
accesses outside of it are allowed to first leak in, and then not have
ordering constraints within the locked region.

That said, we may have some confusion there, and I would *STRONGLY*
suggest that architectures should have stronger lock consistency
guarantees than the theoretical ones. Especially since x86 has such
strong rules, and locking is a full memory barrier. Anybody with very
weak ordering is likely to just hit more bugs.

And so while I'm not sure it's ever been documented, I do think it is
likely a good idea to just make sure that "lock+unlock" is a full
memory barrier, the same way "unlock+lock" is. I think it's
practically true anyway on all architectures (with the possible
exception of ia64, which I think actually implements real
acquire/release semantics)

(Practically speaking, there really aren't many reasons to allow
writes to be re-ordered wrt lock/unlock operations, and there is no
reason to ever move reads later, only earlier. Writes are _not_
performance-sensitive the way reads are, so there is no real reason to
really take advantage of the one-way permeability for them. It's reads
that you really want to speculate and do early, not writes, and so
allowing a read after a UNLOCK to percolate into the critical region
is really the only relevant case from a performance perspective).

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at