Re: SMP syncronization on AMD processors (broken?)
From: Kyle Moffett
Date: Tue Oct 11 2005 - 22:28:57 EST
On Oct 11, 2005, at 22:39:50, linux@xxxxxxxxxxx wrote:
>> This may work on some processors, but on others the read of
>> "progress" in XXX, or the write in YYY may require arch-specific
>> code to force the update out to other cpus.
>> Alternately, explicitly atomic operations should suffice, but a
>> simple increment is probably not enough for portable code.
>
> Er.. you mean, the pre-incremented value could be cached
> *indefinitely* by XXX? That seems odd...
> I can see an arch hook (memory barrier sort of thing) to push it
> out a bit faster, but are there architectures on which noticing the
> increment could be delayed indefinitely?
> In particular, that same hook would already be used by the spin
> lock release sequence (to ensure that someone else notices the lock
> is now available), and unless it's address-specific, it would do
> for the "progress" counter as well.
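
For concreteness, the "progress" counter pattern being discussed can be
sketched in portable user-space C11. This is only an illustration under
stated assumptions, not the XXX/YYY code from this thread and not kernel
code (the kernel has its own barrier primitives such as smp_wmb() and
smp_rmb()). The names payload, progress, producer, consumer and run_demo
are hypothetical. The writer publishes each step with a release store and
the reader polls with an acquire load, which is the "explicitly atomic
operations" variant rather than a plain increment.

```c
/* Sketch only: user-space C11 atomics + pthreads, all names hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define STEPS 1000

static int payload[STEPS];     /* plain data written by the producer      */
static atomic_int progress;    /* how many payload slots are published    */

static void *producer(void *arg)
{
    (void)arg;
    for (int i = 0; i < STEPS; i++) {
        payload[i] = i * 2;
        /* Release store: the payload write above becomes visible to any
         * thread that later observes this counter value via an acquire. */
        atomic_store_explicit(&progress, i + 1, memory_order_release);
    }
    return NULL;
}

static void *consumer(void *arg)
{
    long *sum = arg;
    int seen = 0;

    while (seen < STEPS) {
        /* Acquire load: pairs with the producer's release store. */
        int now = atomic_load_explicit(&progress, memory_order_acquire);
        for (; seen < now; seen++) {
            /* Every slot below "now" must already hold its final value. */
            assert(payload[seen] == seen * 2);
            *sum += payload[seen];
        }
    }
    return NULL;
}

/* Runs one producer and one consumer; returns the consumer's sum,
 * or -1 if a thread could not be created. */
long run_demo(void)
{
    pthread_t p, c;
    long sum = 0;

    atomic_store(&progress, 0);
    if (pthread_create(&c, NULL, consumer, &sum) != 0)
        return -1;
    if (pthread_create(&p, NULL, producer, NULL) != 0) {
        /* Let the consumer's polling loop finish before bailing out. */
        atomic_store(&progress, STEPS);
        pthread_join(c, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return sum;
}
```

On a toolchain that needs it, build with -pthread. The consumer here
busy-polls, which is fine for a demonstration but not something to copy
into real code as-is.
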
Umm, IIRC, some architectures (don't remember which ones, but I'd
guess it's the big 512-way boxen) have cache-line-and-memory models
such that a cacheline may remain out-of-date indefinitely unless the
CPU with the update runs a "cache-line flush" instruction or the CPU
that wants the update requests it with an exclusive cacheline lock or
similar. On such a system, the only way to ensure safe distribution
of data between CPUs is to make sure it's in the same cacheline as
the spinlock (and document that fact) or use special instructions to
Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are, by
definition, not smart enough to debug it.
-- Brian Kernighan