Already done, actually. The overhead for context switching between
threads is minimal: on the x86 it is handled by the hardware (no TLB
flush when the page-table pointers match), and on other architectures
(alpha, sparc) the context-switch routines notice it automatically.
The only thing that needs doing is for the scheduling code to give some
"bonus points" to threads that share the same mm space, so that the
scheduler knows it should prefer running such threads back to back,
because the switch between them is low-overhead. This is trivial to do:
look at the goodness() function in kernel/sched.c, where it does
something like this:
        /* .. and a slight advantage to the current process */
        if (p == prev)
                weight += 1;
which should probably be
        /* .. and a slight advantage to same VM setup */
        if (p->mm == prev->mm)
                weight += 1;
but I haven't actually tried that out..
> There are probably some changes that could be made to the kernel to
> lower the overhead of switching between two threads of the same
> process. The one that I can think of is sharing of all thread invariant
> task_struct data. When the clone() ( do_fork() ) routine is called
> a new struct task_struct is allocated, and the clone()'ing process's
> entire task_struct is copied.
No. Only the per-thread data is copied; the thread-invariant stuff is
already shared. Look at the "struct mm_struct", "struct files_struct" and
"struct signal_struct" etc. pointers in the task structure. A clone() that
shares those structures just increments a usage count instead of copying
anything.
In short, the kernel should already do all of this correctly. The only
thing lacking is the testing part (and as part of testing, people may find
some things that could be handled better; I hope the pthreads interface
will allow more people to test this all out).
Linus