[PATCH] include/asm-i386/semaphore.h optimizations

Artur Skawina (skawina@geocities.com)
Sat, 24 Jul 1999 18:14:38 +0200


This is a multi-part message in MIME format.

--------------34A3D2677F0023817EA3BE35
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

The patch

o changes "incl 0(%ecx)" to "incl (%ecx)" (and likewise for decl).
  gas does not change the addressing mode in this case, so the
  explicit zero displacement is emitted as an extra byte - why waste
  a full byte of icache for every semaphore op?.. :)

o makes the contention case use the 'conventional' call/ret calling
  style instead of push/jmp. Just for fun, and to see how much it
  matters on modern CPUs, I benchmarked
      asm("call 2f; jmp 1f; nop\n2: ret\n1:" );
  vs
      asm("pushl $1f; jmp 2f; nop\n2: ret\n1:" );
  The latter is up to six times slower here. [1]
  For more normal code the difference is obviously much smaller,
  but the effects on the BTBs are probably still there.

artur

[1]
 239  443  444   43   32 : 0 callcall
1715 2660 2660  286  382 : 0 calljmp

--------------34A3D2677F0023817EA3BE35
Content-Type: text/plain; charset=us-ascii; name="ttt"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="ttt"

diff -urNp --exclude-from /usr/src/lkdontdiff /img/linux-2.3.12p1/include/asm-i386/semaphore.h linux-2.3.12p1as/include/asm-i386/semaphore.h
--- /img/linux-2.3.12p1/include/asm-i386/semaphore.h Sat Jul 24 08:48:44 1999
+++ linux-2.3.12p1as/include/asm-i386/semaphore.h Sat Jul 24 15:51:57 1999
@@ -17,6 +17,10 @@
* potential and subtle race discovered by Ulrich Schmid
* in down_interruptible(). Since I started to play here I
* also implemented the `trylock' semaphore operation.
+ * 1999-07-02 Artur Skawina <skawina@geocities.com>
+ * Optimized "0(ecx)" -> "(ecx)" (the assembler does not
+ * do this). Changed calling sequences from push/jmp to
+ * traditional call/ret.
*
* If you would like to see an analysis of this implementation, please
* ftp to gcom.com and download the file
@@ -112,12 +116,12 @@ extern inline void down(struct semaphore
#ifdef __SMP__
"lock ; "
#endif
- "decl 0(%0)\n\t"
+ "decl (%0)\n\t" /* --sem->count */
"js 2f\n"
"1:\n"
".section .text.lock,\"ax\"\n"
- "2:\tpushl $1b\n\t"
- "jmp __down_failed\n"
+ "2:\tcall __down_failed\n\t"
+ "jmp 1b\n"
".previous"
:/* no outputs */
:"c" (sem)
@@ -137,13 +141,13 @@ extern inline int down_interruptible(str
#ifdef __SMP__
"lock ; "
#endif
- "decl 0(%1)\n\t"
+ "decl (%1)\n\t" /* --sem->count */
"js 2f\n\t"
"xorl %0,%0\n"
"1:\n"
".section .text.lock,\"ax\"\n"
- "2:\tpushl $1b\n\t"
- "jmp __down_failed_interruptible\n"
+ "2:\tcall __down_failed_interruptible\n\t"
+ "jmp 1b\n"
".previous"
:"=a" (result)
:"c" (sem)
@@ -164,13 +168,13 @@ extern inline int down_trylock(struct se
#ifdef __SMP__
"lock ; "
#endif
- "decl 0(%1)\n\t"
+ "decl (%1)\n\t" /* --sem->count */
"js 2f\n\t"
"xorl %0,%0\n"
"1:\n"
".section .text.lock,\"ax\"\n"
- "2:\tpushl $1b\n\t"
- "jmp __down_failed_trylock\n"
+ "2:\tcall __down_failed_trylock\n\t"
+ "jmp 1b\n"
".previous"
:"=a" (result)
:"c" (sem)
@@ -194,12 +198,12 @@ extern inline void up(struct semaphore *
#ifdef __SMP__
"lock ; "
#endif
- "incl 0(%0)\n\t"
+ "incl (%0)\n\t" /* ++sem->count */
"jle 2f\n"
"1:\n"
".section .text.lock,\"ax\"\n"
- "2:\tpushl $1b\n\t"
- "jmp __up_wakeup\n"
+ "2:\tcall __up_wakeup\n\t"
+ "jmp 1b\n"
".previous"
:/* no outputs */
:"c" (sem)

--------------34A3D2677F0023817EA3BE35--
