Re: [PATCH RFC] x86: avoid atomic operation in test_and_set_bit_lock if possible

From: Linus Torvalds
Date: Thu Mar 24 2011 - 13:10:17 EST


On Wed, Mar 23, 2011 at 9:56 PM, Nikanth Karthikesan <knikanth@xxxxxxx> wrote:
> On x86_64 SMP with lots of CPUs, atomic instructions which assert the LOCK#
> signal can stall other CPUs. And as the number of cores increases this penalty
> scales proportionately. So it is best to try and avoid atomic instructions
> wherever possible. test_and_set_bit_lock() can avoid using LOCK_PREFIX if it
> finds the bit set already.

This is potentially _very_ wrong. It means that test_and_set_bit() is
no longer a serializing instruction in the failure case, and I wonder
what effect that will have on the thousands of users.

It also means that test_and_set_bit() on an uncached entry now starts
out with a plain read that pulls the cache line in shared, and then has
to upgrade it to exclusive for the locked operation. That can be quite a
bit slower than going for exclusive ownership right away in the
hopefully common case where it succeeds.

So no, I really think this is seriously wrong. It basically makes it
impossible for the user of the bitop function to do a good job if it
wants to.

WHICH test_and_set_bit() are you having performance issues with?
Because I think the right approach is to do this optimization on a
case-by-case basis in the code that actually does the operation, not
in the low-level routine.
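
As a rough illustration of what "in the code that actually does the
operation" could look like, here is a minimal userspace sketch using C11
atomics. The names test_and_set_bit_atomic() and try_lock_bit_contended()
are hypothetical stand-ins, not the kernel's bitop API, and the sketch
assumes a caller that knows its bit is usually already set:

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the low-level primitive (NOT the kernel API).
 * It always does the locked read-modify-write, so it stays fully ordered
 * whether or not the bit was already set. Returns the old bit value.
 */
static inline bool test_and_set_bit_atomic(unsigned int nr, atomic_ulong *word)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(word, mask) & mask) != 0;
}

/*
 * Hypothetical caller-side fast path. Only a call site that expects the
 * bit to be set most of the time, and that does not rely on any ordering
 * when it fails, should opt in to the plain read first.
 * Returns true if this caller set the bit.
 */
static inline bool try_lock_bit_contended(unsigned int nr, atomic_ulong *word)
{
	unsigned long mask = 1UL << nr;

	/* Plain (relaxed) read: no locked operation if the bit is taken. */
	if (atomic_load_explicit(word, memory_order_relaxed) & mask)
		return false;

	/* Bit looked clear: now do the real atomic test-and-set. */
	return !test_and_set_bit_atomic(nr, word);
}
```

Call sites where the bit is normally clear would keep calling the plain
primitive and get exclusive ownership in one step, while only the known
contended call site pays for the extra read.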

Linus