Note that the _fundamental_ problem is really that Intel doesn't have the
concept of a "memory barrier". The right way to solve this problem is to
put a memory barrier between the two instructions, but the way Intel
thinks about these things, they only have the concept of a
"synchronization point".

If you read an Alpha manual, you'll notice that the Alpha people know what
the issue is, and you'll find that an Alpha has three different kinds of
memory barriers: read barriers, write barriers and a read-write barrier
(which is usually just called a "memory barrier" and which acts the way an
Intel "synchronization point" acts wrt memory ordering).
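
To make the three kinds of barrier concrete, here is a minimal sketch in
portable C11. It is not Alpha or Intel code: it assumes that the C11 fences
from <stdatomic.h> are an acceptable stand-in (a release fence for a write
barrier, an acquire fence for a read barrier, a seq_cst fence for the full
"memory barrier"), and these fences are somewhat stronger than pure
write-only or read-only barriers. The struct mailbox type and the
mailbox_publish(), mailbox_consume() and full_barrier() names are
hypothetical, made up for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-slot mailbox, invented for this illustration. */
struct mailbox {
    int payload;        /* plain data, written before the flag */
    atomic_bool ready;  /* flag the consumer checks */
};

/* Producer side: store the data, then a write barrier, then the flag. */
void mailbox_publish(struct mailbox *mb, int value)
{
    assert(mb != NULL);
    mb->payload = value;
    /* Stand-in for a write barrier: the payload store above may not be
     * reordered after the flag store below. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&mb->ready, true, memory_order_relaxed);
}

/* Consumer side: load the flag, then a read barrier, then the data. */
bool mailbox_consume(struct mailbox *mb, int *out)
{
    assert(mb != NULL && out != NULL);
    if (!atomic_load_explicit(&mb->ready, memory_order_relaxed))
        return false;   /* nothing published yet */
    /* Stand-in for a read barrier: the payload load below may not be
     * reordered before the flag load above. */
    atomic_thread_fence(memory_order_acquire);
    *out = mb->payload;
    return true;
}

/* Stand-in for the read-write ("memory") barrier: orders all earlier
 * loads and stores against all later ones. */
void full_barrier(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}
```

The point of the split is that the producer only needs to order its two
stores and the consumer only needs to order its two loads; neither side has
to pay for a full synchronization point.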

So yes, the fundamental problem is not instruction re-ordering, but memory
access re-ordering. But on Intel, the only way you can handle this is with
one of these synchronization points (and they are expensive: cpuid is one
of the cheapest ones, but it trashes a lot of registers (eax, ebx, ecx and
edx), which results in secondary expenses due to reloads etc. that the CPU
can't even optimize, due to the very same memory ordering constraints that
the cpuid is there to add).
Linus