On Sun, 2005-04-03 at 09:10 +0200, Manfred Spraul wrote:No. If everything is cpu-local, then there are obviously no cache line transfers. LOCK is not that expensive. On a Pentium 3, it was 20 cpu cycles. On an Athlon 64, it's virtually free.
Yes - sem or spin locks are quicker as long as no cache line transfers are necessary. If the semaphore is accessed by multiple cpus, then kmalloc would be faster: slab tries hard to avoid taking global locks. I'm not speaking about contention, just the cache line ping pong for acquiring a free semaphore.
Without contention, is there still a problem with cache line ping pong
of acquiring a free semaphore?
I mean, say only one task is using a given semaphore. Is there still
going to be cache line transfers for acquiring it? Even if the task in
question stays on a CPU. Is the "LOCK" on an instruction that expensive
even if the other CPUs haven't accessed that location of memory.