[patch] exit_free(), 2.5.31-A0

From: Ingo Molnar (mingo@elte.hu)
Date: Tue Aug 13 2002 - 10:25:33 EST


the attached patch implements a new syscall, exit_free():

        exit_free(error_code, addr, val);

this syscall is used as a performance optimization in glibc's threading
library.

upon exiting of child threads there is an ugly race condition that must be
solved: the freeing of the child stack. Old pthreads used to set up a
trampoline, jump to it, munmap() the child stack and call sys_exit(). It
does not have to be detailed how much this hurts performance: the
munmap(), besides being slow for such a lightweight thing as thread-exit,
also flushes the TLB, possibly across CPUs. Also, locality of reference of
thread stacks is lost as well.

the new thread library solves this performance by introducing a 'thread
stack cache' - a simple list of thread stacks. (with some more details to
handle different stack sizes.) The problem with the stack cache is that
there's a race condition: who releases it? We must not free the thread
stack up until the very last instruction the current thread executes,
because a signal handler might arrive and might use an alrady freed stack.
Other threads might pick this stack and overwrite it ... Disabling signals
upon thread-exit adds a syscall overhead, but it still doesnt solve the
fundamental problem of 'who frees the stack'. A global semaphore for a
trampoline stack would have to be introduced to be able to free the stack
safely - a messy, slow and unscalable solution. Or queueing the stack to a
helper thread is equally messy.

again the kernel can give the threading library a helping hand to solve
this catch-22 problem, surprisingly easily in this case as well.
exit_free() simply writes a user-provided word back to userspace. At that
point the user stack is not used anymore (nor will it ever be - sys_exit()
cannot fail), so freeing it is appropriate. The actual way glibc utilizes
exit_free() is that there's a "is this stack free" flag in the thread
stack control structure, and exit_free() sets this to '1'. So upon
thread-exit the thread frees the stack and puts it into the stack cache -
but other threads will skip over it because the 'free' flag is still 0.

with this syscall it was possible to implement single-syscall thread exit
in glibc. (well, actually not yet, the next patch i send will enable this
fully by solving the "who does the waitpid()?" problem.)

        Ingo

--- linux/arch/i386/kernel/entry.S.orig Tue Aug 13 17:13:30 2002
+++ linux/arch/i386/kernel/entry.S Tue Aug 13 17:13:42 2002
@@ -755,6 +755,7 @@
         .long sys_set_thread_area
         .long sys_get_thread_area
         .long sys_clone_startup /* 245 */
+ .long sys_exit_free
 
         .rept NR_syscalls-(.-sys_call_table)/4
                 .long sys_ni_syscall
--- linux/include/asm-i386/unistd.h.orig Tue Aug 13 17:13:00 2002
+++ linux/include/asm-i386/unistd.h Tue Aug 13 17:13:17 2002
@@ -250,6 +250,7 @@
 #define __NR_set_thread_area 243
 #define __NR_get_thread_area 244
 #define __NR_clone_startup 245
+#define __NR_exit_free 246
 
 /* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
 
--- linux/kernel/exit.c.orig Tue Aug 13 17:12:06 2002
+++ linux/kernel/exit.c Tue Aug 13 17:12:45 2002
@@ -586,6 +586,12 @@
         do_exit((error_code&0xff)<<8);
 }
 
+asmlinkage long sys_exit_free(int error_code, unsigned long *addr, unsigned long val)
+{
+ put_user(val, addr);
+ do_exit((error_code&0xff)<<8);
+}
+
 asmlinkage long sys_wait4(pid_t pid,unsigned int * stat_addr, int options, struct rusage * ru)
 {
         int flag, retval;

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Aug 15 2002 - 22:00:32 EST