Correct. The optimization will work just fine, and it increases
performance significantly. After the LOCK# lead gets pulled, you will
see tons of non-cacheable memory references on an analyzer as the
processor refills the pipelines and internal write buffers. It's
heavier than a TLB flush. My observations indicated that the PPro would
lose 24+ clock cycles (and, depending on the memory bus controller
chipset on your motherboard, even more) recovering after a LOCK#
assertion.
NetWare 4/5 uses this optimization in its spinlocks. It works fine and
boosts performance. On Intel machines, LOCK# assertions are heavier than
they need to be because of all the issues Intel has with people writing
self-modifying code (about 60% of their errata deal with this problem).
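
For anyone following along, here is a rough sketch of the kind of
spinlock I mean. It assumes the optimization under discussion is
releasing the lock with an ordinary store rather than a LOCK-prefixed
instruction. The names (sketch_spinlock, sketch_spin_lock, and so on)
are made up for illustration and are neither the Linux nor the NetWare
code, and it uses C11 atomics rather than hand-written assembly.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock type for illustration only. */
typedef struct {
    atomic_int locked;              /* 0 = free, 1 = held */
} sketch_spinlock;

static inline void sketch_spin_init(sketch_spinlock *l)
{
    atomic_init(&l->locked, 0);
}

static inline bool sketch_spin_trylock(sketch_spinlock *l)
{
    /* Atomic read-modify-write. On x86 this is typically an XCHG,
     * which asserts the lock implicitly. Returns true if we got it. */
    return atomic_exchange_explicit(&l->locked, 1,
                                    memory_order_acquire) == 0;
}

static inline void sketch_spin_lock(sketch_spinlock *l)
{
    while (!sketch_spin_trylock(l)) {
        /* Spin on plain loads until the lock looks free, so the
         * locked operation is only retried when it might succeed. */
        while (atomic_load_explicit(&l->locked,
                                    memory_order_relaxed) != 0)
            ;
    }
}

static inline void sketch_spin_unlock(sketch_spinlock *l)
{
    /* Release with a plain store. On x86, compilers typically emit
     * an ordinary MOV here, with no LOCK prefix (assumption: this is
     * the optimization being discussed). */
    atomic_store_explicit(&l->locked, 0, memory_order_release);
}
```

The acquire side still pays for one locked operation, since it has to
be a read-modify-write. Only the release path avoids it, which is where
the savings described above would come from.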
Jeff
>
> -- Jamie