Jamie Lokier wrote:
>
> Jeff V. Merkey wrote:
> > The best info I know of is to get an analyser that plugs into the
> > processor socket (like an American Arium) and enable branch trace
> > messaging to monitor the interaction between the processor and the cache
> > controllers. You get info that's not in any Intel book just watching
> > the thing run. Nasty complicated stuff. They explain some of it in the
> > cache controller architecture manuals -- these are public.
>
> I still don't see how processor traces will tell me what ordering
> guarantees I can rely on across the range of processors.
In the unlock case, ordering doesn't really matter except when
"hotlocking" -- several processors all hitting the same spinlock and
blocking on it a lot. Even then it makes no difference whether you use
"lock bts" or a plain "mov <addr>", since there's no guarantee enforced
in the hardware as to which processor will get the lock next -- it's
arbitrary.
The assumption made here is that when the lock has been taken and is
already owned, everyone else is already spinning on it and blocked anyway.
If you unlock with a "mov <addr>,0", the cache controllers will update
the other cache lines after the write propagates, and another processor
will succeed in getting the lock. It doesn't matter whether the store
shows up right away across everyone's cache lines (which it does anyway) --
one of the processors will take the lock and get through as soon as its
R/M/W cycle completes (unless, of course, the lock field is not dword
aligned, in which case a split bus transaction may occur and show up on
the bus as two atomic transactions instead of one -- although from what
I have seen on an analyzer, Intel holds the LOCK# pin driven low until
both cycles complete). This means it is completely unnecessary to assert
LOCK# for the unlock case, since there are no ordering issues per se --
the other processors are spinning on the lock already and cannot get
through.
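
To make that concrete, here is a toy sketch -- not the actual kernel
spin_lock/spin_unlock code, the names and layout are made up for
illustration -- of the idea: a locked R/M/W on the acquire side, a plain
aligned store on the release side.

/* Toy spinlock sketch: 0 = free, 1 = held.  Keeping the field dword
 * aligned avoids the split bus transaction mentioned above. */
typedef struct {
        volatile unsigned int slock __attribute__((aligned(4)));
} toy_lock_t;

static inline void toy_lock(toy_lock_t *lp)
{
        unsigned int old;

        do {
                old = 1;
                /* xchg with a memory operand always asserts LOCK#, so
                 * this is the atomic R/M/W cycle that takes the lock. */
                __asm__ __volatile__("xchgl %0, %1"
                                     : "+r" (old), "+m" (lp->slock)
                                     :
                                     : "memory");
        } while (old != 0);             /* spin until we saw it free */
}

static inline void toy_unlock(toy_lock_t *lp)
{
        /* Plain aligned store, no LOCK# -- the waiters are already
         * spinning, and whichever cache snoops the write first gets
         * the lock next. */
        __asm__ __volatile__("movl $0, %0" : "=m" (lp->slock) : : "memory");
}

The acquire loop above pounds the bus harder than a real implementation
would (no read before the xchg), but it shows why LOCK# only matters on
the acquire side.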
Whichever processor updates its cache line first will "kick" the cache
controller into signalling the others, and from what I have observed, it
seems random enough to ensure fairness to the processors waiting on the
lock. Early (ca. 1994) Intel SMP systems (like Tricord) had some problems
with the "mov <addr>,0" method due to timing problems with their own MBC
chips used on the processor boards, and I saw similar problems with
DEC's IOAPIC clone chipsets early on, but these problems got fixed over
time. There is a side effect of the "mov <addr>,0" method where it's
possible for a particular processor to never get the lock if they are
all "hotlocking" on the same lock in memory and spinning 1,000,000 times
or so, due to early hardware designs, but I have not seen this problem
since mid-1994 on any Intel SMP system.
:-)
Jeff
>
> -- Jamie