Re: mmap() versus read()

Perry Harrington (pedward@sun4.apsoft.com)
Tue, 10 Mar 1998 22:50:06 -0800 (PST)


> >
> > PPC Linux does this in the idle task -- a big win. So if the problem is speed
> > of clone, then we should speed up clone.
>
> clone() really _is_ very lightweight. I think that some of the confusion
> comes from the timings of "pthread_create()" which is much less
> lightweight partly due to memory allocation issues, partly just because
> the Linux clone-based pthreads library hasn't had all that much time to be
> optimized etc.

You are correct; I was confusing it with the cost of the Linux threads pthread_create().
This came about because I've had experience with the MIT (provenzano) threads,
and they had a significantly speedier/less costly creation, IIRC.
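
To make the distinction concrete, here is a rough sketch of what a bare
clone() call involves, as opposed to everything pthread_create() does on
top of it. It assumes Linux with glibc; the names child_fn, run_clone_child
and CHILD_STACK_SIZE are made up for illustration, and the flag set
(CLONE_VM | SIGCHLD) is only one plausible choice, not what any threads
library necessarily uses.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define CHILD_STACK_SIZE (64 * 1024)   /* arbitrary size for this sketch */

/* Runs in the child. With CLONE_VM the child shares the parent's
 * address space, so this store is visible to the parent. */
static int child_fn(void *arg)
{
    int *shared = arg;
    *shared = 42;
    return 0;
}

/* Create one clone() child, wait for it, and return the value it
 * wrote into shared memory. Returns -1 on any failure. */
static int run_clone_child(void)
{
    int shared = 0;
    char *stack = malloc(CHILD_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows down on most architectures, so pass its top.
     * SIGCHLD as the exit signal lets a plain waitpid() reap the child. */
    pid_t pid = clone(child_fn, stack + CHILD_STACK_SIZE,
                      CLONE_VM | SIGCHLD, &shared);
    if (pid == -1) {
        free(stack);
        return -1;
    }
    if (waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }
    free(stack);
    return shared;
}
```

The only allocation here is the child's stack, which the caller supplies.
A pthreads library has to add its own bookkeeping around a call like this,
which is where the extra creation cost would come from.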

>
> I'd be very surprised if a basic clone() call is any slower than even the
> lightest-weight Solaris kernel threads: there really is almost no overhead
> at all: most of it is a fairly simple memory allocation plus some slight
> copying of kernel state to the new thread.

You are probably correct. I still think that A LOT of things in the system
could benefit from having pre-zeroed pages. Perhaps skbuff initialization
would be orders of magnitude faster? Process creation would be faster, due to
less iterative initialization. High-speed, processor-specific code could
make this cheaper; I seem to recall reading that the MMX extensions (sorry
if this seems like a 'kick', it's not) can do the same operation on multiple
memory chunks in parallel. My jury is still out on whether MMX is all it's
cracked up to be...
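
As a user-space model of the pre-zeroed page idea (not kernel code, and
not how PPC Linux actually does it), the sketch below keeps a small pool
that an idle loop would top up, and an allocation path that takes a page
from the pool when it can and zeroes on demand otherwise. All names and
sizes (zero_pool, pool_refill_one, get_zeroed_page, PAGE_SIZE_BYTES,
POOL_CAPACITY) are assumptions for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE_BYTES 4096   /* assumed page size */
#define POOL_CAPACITY   8      /* arbitrary pool depth for the sketch */

static void *zero_pool[POOL_CAPACITY];
static size_t zero_pool_count;

/* Meant to be called from an idle loop: zero one page ahead of time.
 * Returns 1 if a page was added, 0 if the pool is full or out of memory. */
static int pool_refill_one(void)
{
    if (zero_pool_count == POOL_CAPACITY)
        return 0;
    void *page = malloc(PAGE_SIZE_BYTES);
    if (page == NULL)
        return 0;
    memset(page, 0, PAGE_SIZE_BYTES);      /* the work moved to idle time */
    zero_pool[zero_pool_count++] = page;
    return 1;
}

/* Allocation path: hand out a pre-zeroed page if one is waiting,
 * otherwise fall back to zeroing at allocation time.
 * *was_prezeroed reports which path was taken. */
static void *get_zeroed_page(int *was_prezeroed)
{
    if (zero_pool_count > 0) {
        *was_prezeroed = 1;
        return zero_pool[--zero_pool_count];
    }
    *was_prezeroed = 0;
    return calloc(1, PAGE_SIZE_BYTES);
}
```

The win, if there is one, is that the memset() cost is paid when the CPU
has nothing better to do, so the fast path is just a pop from the pool.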

>
> And partly _because_ Linux kernel threads created with clone() are not a
> special case but a first-class citizen with any other process, a lot of
> code is actually simplified and speeded up. For example, the context
> switching doesn't have any special cases etc (it has a weight factor that
> tries to give priority to threads that share the same memory management,
> but that's a fairly simple mechanism for better performance).

I read somewhere that sharing of the PPID was going to be implemented for
clone(); how is this done, and in what release? I'd be interested in hacking
together the 'right' support for threads (appropriate signal masks and
delivery). The other feature of LWPs that has been lost in this discussion
is that multiple threads are scheduled upon a single LWP. Traditional
Solaris allocates LWPs sparsely. The only reason to do a 1:1 binding,
or specify a binding, is that LWPs are the entities that block on system calls.
As long as you're not making any calls that could block, LWPs can efficiently
schedule threads. This is advantageous in that thread creation is REALLY
inexpensive, because it doesn't create a new LWP every time. I think that
this is the secret to an appropriate thread implementation that's "lightweight".

Here is a direct quote from "Threads Primer: A guide to multithreaded
programming" by Bil Lewis and Daniel J Berg, former Sun engineers:
(p41)

Many Threads on One LWP

The first technique is known as the "Many-to-One" model. It is also
known as "co-routining". Numerous threads are created in user space,
and they all take turns running on the one LWP. Programming on such
a model will give you a superior programming paradigm, but running
your program on an MP machine will not give you any speedup, and
when you make a blocking system call, the whole process [LWP] will
block. However, the thread creation, scheduling, and synchronization
is all done 100% in user space, so it's fast and cheap and uses no
kernel resources.

I think that maintaining the current clone() call for creating 'LWPs' is
cool, however the signal handling and such needs to be brought into spec
to make it robust. I would be interested in working on this, however I
would need a starting pointer or two. Additionally, doing a 1:1 thread
-> LWP mapping is costly. The creation of a thread could be significantly
improved if the Solaris model was implemented. This of course is another
conversation... (for those kneejerk reactions, don't fret, the Solaris
API allows you to tune this to a 1:1 mapping, I had to do this for a program
that did a mass amount of reverse DNS). Having a Solaris compatible libthread
interface would be another step to making Linux acceptable in the "corporate"
world. Robust threading is a prime requirement for mission critical programs
that would otherwise occupy a Sparc...

>
> I'm sure people can find threads that are even more light-weight than
> clone(), but I don't think they'll be as generic as clone(), and I doubt
> they'll be _much_ faster.

Of course userspace implementations would be slower. Scheduling of multiple
threads on one LWP could be fairly simple: just have a task queue that gets
maintained for each LWP's Nth context switch, pushing the xxxx_yield() direct
to the bottom, and expiring any running threads. I can provide some specifics
on the Solaris scheduling method, from my book.
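
A minimal sketch of that kind of queue, assuming the glibc <ucontext.h>
calls (getcontext, makecontext, swapcontext) stand in for the context
switch. This is a cooperative round-robin only: a yield pushes the caller
to the bottom of the queue, and there is no preemption or expiry. It is
not the Solaris scheduler, and every name here (uthread_create,
uthread_yield, uthread_run_all, worker, trace) is invented for the example.

```c
#include <stddef.h>
#include <ucontext.h>

#define MAX_THREADS 4
#define STACK_BYTES (64 * 1024)

struct uthread {
    ucontext_t ctx;
    char stack[STACK_BYTES];
    void (*fn)(int);
    int id;
    int done;
};

static struct uthread threads[MAX_THREADS];
static int nthreads;
static int runq[MAX_THREADS];        /* circular queue of thread indices */
static int rq_head, rq_len;
static ucontext_t sched_ctx;         /* the scheduler loop's own context */
static struct uthread *current;

static void rq_push(int idx)
{
    runq[(rq_head + rq_len) % MAX_THREADS] = idx;
    rq_len++;
}

static int rq_pop(void)
{
    int idx = runq[rq_head];
    rq_head = (rq_head + 1) % MAX_THREADS;
    rq_len--;
    return idx;
}

/* Give up the "LWP": save this thread and switch back to the scheduler. */
static void uthread_yield(void)
{
    swapcontext(&current->ctx, &sched_ctx);
}

/* Entry point for every thread; marks it finished when fn returns,
 * after which uc_link resumes the scheduler. */
static void trampoline(void)
{
    current->fn(current->id);
    current->done = 1;
}

/* Create a thread entirely in user space. Returns its id, or -1. */
static int uthread_create(void (*fn)(int))
{
    if (nthreads == MAX_THREADS)
        return -1;
    struct uthread *t = &threads[nthreads];
    if (getcontext(&t->ctx) == -1)
        return -1;
    t->ctx.uc_stack.ss_sp = t->stack;
    t->ctx.uc_stack.ss_size = sizeof t->stack;
    t->ctx.uc_link = &sched_ctx;
    t->fn = fn;
    t->id = nthreads;
    t->done = 0;
    makecontext(&t->ctx, trampoline, 0);
    rq_push(nthreads);
    return nthreads++;
}

/* Run queued threads round-robin until all have finished. */
static void uthread_run_all(void)
{
    while (rq_len > 0) {
        int idx = rq_pop();
        current = &threads[idx];
        swapcontext(&sched_ctx, &current->ctx);
        if (!current->done)
            rq_push(idx);            /* yielded: goes to the bottom */
    }
}

/* Demo worker: records its letter in a trace, then yields, twice. */
static char trace[32];
static int trace_len;

static void worker(int id)
{
    for (int step = 0; step < 2; step++) {
        trace[trace_len++] = (char)('A' + id);
        uthread_yield();
    }
}
```

Creating two workers and calling uthread_run_all() leaves "ABAB" in the
trace. Creation here is just filling in a context and a queue slot, with
no system call to create a kernel entity, which is the cheapness argued
for above. The flip side is the one the book mentions: a blocking system
call in any worker stalls all of them.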

FWIW, this book mentions DEC Unix's light and heavyweight threads: light
being the userspace, heavy being a userspace thread bound to an LWP.

I hope that this doesn't all seem like argument, but rather constructive
discussion.

>
> Linus
>

--Perry

-- 
Perry Harrington       Linux rules all OSes.    APSoft      ()
email: perry@apsoft.com 			Think Blue. /\
