Re: [PATCH] workqueue: Fix memory ordering race in queue_work*()

From: Linus Torvalds
Date: Tue Aug 16 2022 - 04:22:16 EST


On Mon, Aug 15, 2022 at 10:49 PM Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx> wrote:
>
> I think this is the source of all this: [ commit 61e02392d3c7 ]

Yes, that doc update seems to actually predate the broken code.

But you can actually see how that commit clearly was only meant to
change the "test_and_set_bit_lock()" case. IOW, in that commit, the
test_and_set_bit() comments clearly say

* test_and_set_bit - Set a bit and return its old value
* This operation is atomic and cannot be reordered.
* It may be reordered on other architectures than x86.
* It also implies a memory barrier.

(that "It may be reordered on other architectures than x86" does show
clear confusion, since it also says "It also implies a memory
barrier")

And the very commit that adds that

- RMW operations that are conditional are unordered on FAILURE,

language then updates the comment only above test_and_set_bit_lock().

And yes, on failure to lock, ordering doesn't matter. You didn't get
the bit lock, there is no ordering for you.

But then I think that mistake in the documentation (the relaxed
semantics was only for the "lock" version of the bitops, not for the
rest ot it) then later led to the mistake in the code.

End result: test_and_set_bit_lock() does indeed only imply an ordering
if it got the lock (ie it was successful).

But test_and_set_bit() itself is always "succesful". If the bit was
already set, that's not any indication of any "failure". It's just an
indication that it was set. Nothing more, nothing less, and the memory
barrier is still required regardless.

Linus