Re: SMP syncronization on AMD processors (broken?)

From: Kyle Moffett
Date: Tue Oct 11 2005 - 22:28:57 EST


On Oct 11, 2005, at 22:39:50, linux@xxxxxxxxxxx wrote:
> > This may work on some processors, but on others the read of "progress" in XXX, or the write in YYY, may require arch-specific code to force the update out to other CPUs.
> >
> > Alternately, explicitly atomic operations should suffice, but a simple increment is probably not enough for portable code.
>
> Er.. you mean, the pre-incremented value could be cached *indefinitely* by XXX? That seems odd...
>
> I can see an arch hook (memory barrier sort of thing) to push it out a bit faster, but are there architectures on which noticing the increment could be delayed indefinitely?
>
> In particular, that same hook would already be used by the spin lock release sequence (to ensure that someone else notices the lock is now available), and unless it's address-specific, it would do for the "progress" counter as well.

Umm, IIRC, some architectures (I don't remember which ones, but I'd guess it's the big 512-way boxen) have cache-line-and-memory models such that a cacheline may remain out-of-date indefinitely unless the CPU with the update runs a "cache-line flush" instruction, or the CPU that wants the update requests it with an exclusive cacheline lock or similar. On such a system, the only way to ensure safe distribution of data between CPUs is to make sure it's in the same cacheline as the spinlock (and document that fact) or use special instructions to verify coherency.
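
For what it's worth, here is a rough userspace sketch of the "explicitly atomic operations" approach to a "progress" counter. It uses C11 atomics and pthreads rather than any kernel primitive, and every name in it (results, writer, run_demo, STEPS) is made up for illustration; nothing here comes from the code under discussion. The writer publishes each result with a release store to the counter, and the reader picks it up with an acquire load, so the plain data written before the bump should be visible once the new counter value is seen.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define STEPS 1000u                /* hypothetical amount of work */

static unsigned results[STEPS];    /* plain data, published via "progress" */
static atomic_uint progress;       /* number of results published so far */

/* Writer: store the result first, then bump the counter with release
 * ordering so the result cannot become visible after the counter. */
static void *writer(void *unused)
{
    (void)unused;
    for (unsigned i = 0; i < STEPS; i++) {
        results[i] = i * 2;
        atomic_store_explicit(&progress, i + 1, memory_order_release);
    }
    return NULL;
}

/* Reader: wait for each step with an acquire load, then read the plain
 * data. Returns how many results were observed with the expected value. */
static unsigned run_demo(void)
{
    pthread_t t;
    unsigned ok = 0;
    int rc;

    atomic_store(&progress, 0);
    if (pthread_create(&t, NULL, writer, NULL) != 0)
        return 0;

    for (unsigned i = 0; i < STEPS; i++) {
        while (atomic_load_explicit(&progress, memory_order_acquire) <= i)
            sched_yield();         /* be polite while polling */
        if (results[i] == i * 2)
            ok++;
    }

    rc = pthread_join(t, NULL);
    assert(rc == 0);
    (void)rc;
    return ok;
}
```

Calling run_demo() from a test harness should return STEPS if every published result was seen intact. The release/acquire pair is the portable stand-in for the "arch hook" idea: the compiler emits whatever barrier the target needs, which may be nothing at all on some processors.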

Cheers,
Kyle Moffett

--
Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it.
-- Brian Kernighan

