Re: [patch] __volatile__ needed in get_cycles()?

Andrea Arcangeli (andrea@e-mind.com)
Tue, 30 Mar 1999 19:28:15 +0200 (CEST)


On Tue, 30 Mar 1999, Ingo Molnar wrote:

>mb() does locked instructions:
>
>#define mb() __asm__ __volatile__ ("lock; addl $0,0(%%esp)": : :"memory")

mb() infact won't serialize the instruction fetch part of the cpu. So
adding a lock; addl $0,0(%%esp) _before_ rdtsc won't avoid the CPU to read
the rdtsc instruction before the addl is complted and adding it _after_
the rdtsc won't avoid the cpu to start fetching new instruction before
rdtsc is completed. So if somebody want to have a really perfect
get_cycles() he has to use a _serializing_ instruction like cpuid around
rdtsc.

This is also the reason for which locking the bus is faster than cpuid:
because locking the bus will only force data ordering and that's _exactly_
the _only_ thing we want to achieve with rmb() (and we also want to flush
old memory values cached in registers currently and this handled by
"memory" clobbered).

Note also that rdtsc just has an offset of latency (it was 9 cycles with
my P5MMX if I remeber well) and so adding some overhead due cpuid is not
an issue at all (since to get a perfect value you _just_ had to subtract
the rdtsc latency, and now you'll have to subtract the
get_cycles_ordered() latency instead...).

So I did a new patch against 2.2.5:

Index: include/asm-i386/timex.h
===================================================================
RCS file: /var/cvs/linux/include/asm-i386/timex.h,v
retrieving revision 1.1.1.2
diff -u -r1.1.1.2 timex.h
--- timex.h 1999/03/29 21:39:41 1.1.1.2
+++ linux/include/asm-i386/timex.h 1999/03/30 17:29:01
@@ -44,4 +44,15 @@
#endif
}

+static inline cycles_t get_cycles_serialized(void)
+{
+ cycles_t ret;
+
+ serialize_cpu();
+ ret = get_cycles();
+ serialize_cpu();
+
+ return ret;
+}
+
#endif
Index: include/asm-i386/system.h
===================================================================
RCS file: /var/cvs/linux/include/asm-i386/system.h,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 system.h
--- system.h 1999/01/18 01:27:15 1.1.1.1
+++ linux/include/asm-i386/system.h 1999/03/30 17:16:54
@@ -179,6 +179,16 @@
#define rmb() mb()
#define wmb() __asm__ __volatile__ ("": : :"memory")

+/*
+ * Force total cpu ordering (also instruction fetching won't pass this
+ * serializing instruction). -arca
+ */
+#define serialize_cpu() \
+ __asm__ __volatile__("cpuid" \
+ : \
+ : "a" (0) \
+ : "eax", "ebx", "ecx", "edx");
+
/* interrupt control.. */
#define __sti() __asm__ __volatile__ ("sti": : :"memory")
#define __cli() __asm__ __volatile__ ("cli": : :"memory")

Comments?

Andrea Arcangeli

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/