The patch below further optimizes the x86 context-switch code. On P6-class
CPUs it takes 36 cycles to reload %fs and %gs, even though both registers
are zero in most cases. (Exceptions are LinuxThreads, Wine, DOSEMU, etc.)
It's much cheaper to check whether any non-zero %fs or %gs selector is
involved, and to do the segment reload only in that case.
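
(Just to illustrate the cost, here is a rough user-space sketch, not part
of the patch, that estimates the price of a single %fs reload via rdtsc.
The iteration count and the use of the null selector are arbitrary, and
loop overhead is included, so treat the result as a ballpark figure only.
32-bit x86 only; build with gcc -m32 -O2. On i386 Linux user space %fs is
normally unused, so reloading the null selector here is harmless.)

#include <stdio.h>
#include <stdint.h>

static inline uint64_t rdtsc(void)
{
	uint32_t lo, hi;
	asm volatile("rdtsc" : "=a" (lo), "=d" (hi));
	return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
	const int iters = 1000000;
	unsigned int sel = 0;		/* null selector, always legal to load */
	uint64_t start, end;
	int i;

	start = rdtsc();
	for (i = 0; i < iters; i++)
		asm volatile("mov %0, %%fs" : : "r" (sel));
	end = rdtsc();

	printf("~%llu cycles per %%fs reload\n",
	       (unsigned long long)((end - start) / iters));
	return 0;
}
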
When applied to 2.5.3-pre5, the patch improves 2-task lat_ctx
context-switch performance by 2.5%.
(I've also added unlikely() annotations to the context-switch slow paths.
The performance improvement above was measured on a kernel compiled with a
2.x gcc, which does not support the unlikely() extension yet.)
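
(For reference, unlikely() is a thin wrapper around gcc's
__builtin_expect() branch-prediction hint, roughly along the lines of
include/linux/compiler.h; the exact file contents may differ slightly,
this is shown only to explain why a 2.x gcc sees no effect from the
annotations and why the 2.5% win comes purely from the skipped reloads:

/* gcc before 2.96 has no __builtin_expect(); make it a no-op. */
#if __GNUC__ == 2 && __GNUC_MINOR__ < 96
#define __builtin_expect(x, expected_value) (x)
#endif

#define likely(x)	__builtin_expect((x), 1)
#define unlikely(x)	__builtin_expect((x), 0)

With a gcc that does support the builtin, the hint lets the compiler move
the fs/gs reload, the debug-register reload and the ioperm bitmap copy out
of the hot path.)
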
Ingo
--- linux/arch/i386/kernel/process.c.orig Sun Jan 27 15:33:29 2002
+++ linux/arch/i386/kernel/process.c Sun Jan 27 16:07:45 2002
@@ -689,15 +689,17 @@
 	asm volatile("movl %%gs,%0":"=m" (*(int *)&prev->gs));
 
 	/*
-	 * Restore %fs and %gs.
+	 * Restore %fs and %gs if needed.
 	 */
-	loadsegment(fs, next->fs);
-	loadsegment(gs, next->gs);
+	if (unlikely(prev->fs | prev->gs | next->fs | next->gs)) {
+		loadsegment(fs, next->fs);
+		loadsegment(gs, next->gs);
+	}
 
 	/*
 	 * Now maybe reload the debug registers
 	 */
-	if (next->debugreg[7]){
+	if (unlikely(next->debugreg[7])) {
 		loaddebug(next, 0);
 		loaddebug(next, 1);
 		loaddebug(next, 2);
@@ -707,7 +709,7 @@
 		loaddebug(next, 7);
 	}
 
-	if (prev->ioperm || next->ioperm) {
+	if (unlikely(prev->ioperm || next->ioperm)) {
 		if (next->ioperm) {
 			/*
 			 * 4 cachelines copy ... not good, but not that