Buffering and batching (or pipelining) is a well-known technique in systems
design (known to some, at least) for improving performance, particularly when
latencies are high. X has done this for a long time (approaching 15 years),
it is part of HTTP/1.1, and it appears elsewhere. We didn't consider it rocket
science even when we were doing the early X work over 14 years ago. Look
at VMS QIO and AST delivery for another (ugly) approach, but one that in my
view throws away almost all of the benefits: it still makes the system call
transitions, without the buffering to amortize their expense.
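
To make the latency argument concrete, here is a minimal back-of-the-envelope
model in Python. The function names and the numbers (100 requests, a 50 ms
round trip, 1 ms of server work per request) are assumptions chosen for
illustration, not measurements from X or HTTP.

```python
def synchronous_time(n_requests, round_trip, service):
    """Each request waits for its reply before the next one is sent."""
    return n_requests * (round_trip + service)


def pipelined_time(n_requests, round_trip, service):
    """Requests are sent back to back, so only one round trip is paid."""
    return round_trip + n_requests * service


# Hypothetical numbers: 100 requests, 50 ms round trip, 1 ms of work each.
n, rtt, work = 100, 0.050, 0.001
sync_total = synchronous_time(n, rtt, work)
piped_total = pipelined_time(n, rtt, work)
print(f"synchronous: {sync_total:.3f} s, pipelined: {piped_total:.3f} s")
```

The higher the round-trip time relative to the per-request work, the larger
the gap between the two totals, which is why the technique matters most on
high-latency links.
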
An X request has a total instruction budget of 100 instructions or so;
the only way this is feasible is to avoid system calls like the
plague, and to amortize such expensive operations as read/write and select
over many X requests. I used to regularly characterize X as an exercise
in avoiding system calls.
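
A rough sketch of what amortizing write over many requests looks like, in
Python. The RequestBuffer class, its 4096-byte limit, and the 8-byte requests
are all hypothetical; this is not how Xlib is implemented, only an
illustration of the idea of queueing requests and paying for one system call.

```python
import os


class RequestBuffer:
    """Collects small requests and hands them to the kernel in one write."""

    def __init__(self, fd, limit=4096):
        self.fd = fd
        self.limit = limit          # flush once this many bytes would be exceeded
        self.pending = bytearray()
        self.write_calls = 0        # counts os.write calls, for illustration

    def queue(self, request: bytes):
        # Flush first if this request would overflow the buffer.
        if self.pending and len(self.pending) + len(request) > self.limit:
            self.flush()
        self.pending += request

    def flush(self):
        data = bytes(self.pending)
        while data:                 # handle short writes
            written = os.write(self.fd, data)
            self.write_calls += 1
            data = data[written:]
        self.pending.clear()


read_fd, write_fd = os.pipe()
buf = RequestBuffer(write_fd)
for i in range(100):
    buf.queue(i.to_bytes(8, "little"))   # 100 hypothetical 8-byte requests
buf.flush()
received = os.read(read_fd, 4096)
os.close(read_fd)
os.close(write_fd)
print(len(received), "bytes sent in", buf.write_calls, "write call(s)")
```

Without the buffer, the same 100 requests would cost 100 system calls; here
the cost of the user/kernel transition is spread over all of them.
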
I will note, however, that interface (protocol) design has a major
impact on how well this technique can/will work, and it is hard to retrofit.
We worked pretty hard in X Version 11 design to avoid these problems,
but history has shown we didn't work hard enough.
An example is the X request "InternAtom", which is heavily used (much
more so than we originally thought it would be) and is the basis of a lot
of X's extensibility for client/client and client/window manager
communication. InternAtom gives you a short "atom" name for a string
(and is used as an extensible type system for communication). This is a
synchronous call, and it has turned into a bottleneck (we built in a lot of
basic atoms). With 20-20 hindsight, we should have chosen a suitably sized
hash function, and just always sent a hash, which would have allowed atoms
to always be client generated.
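
The hindsight design could be sketched as follows, in Python. Everything here
is an assumption made for illustration: truncated SHA-256 stands in for the
unspecified "suitably sized hash function", 64 bits is an arbitrary width, and
AtomServer is a toy stand-in for a server-side intern table, not the real X
server.

```python
import hashlib


def client_atom(name: str, bits: int = 64) -> int:
    """Derive an atom from the name locally, with no server round trip.

    The hash choice and width are assumptions made for this sketch.
    """
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[: bits // 8], "big")


class AtomServer:
    """Toy intern table that assigns sequential atoms and counts round trips."""

    def __init__(self):
        self.table = {}
        self.round_trips = 0

    def intern_atom(self, name: str) -> int:
        self.round_trips += 1   # each call blocks the client for one round trip
        return self.table.setdefault(name, len(self.table) + 1)


names = ["WM_PROTOCOLS", "WM_DELETE_WINDOW", "WM_TAKE_FOCUS"]

server = AtomServer()
server_atoms = [server.intern_atom(n) for n in names]   # three round trips
local_atoms = [client_atom(n) for n in names]           # no round trips

print("round trips with server-assigned atoms:", server.round_trips)
print("client-generated atoms:", [hex(a) for a in local_atoms])
```

Because every client computes the same value for the same string, no
synchronous exchange is needed. The price is that the hash must be wide
enough for collisions between distinct names to be negligible.
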
Here's the moral: buffering/batching can work REALLY well, but is BEST done
at design time, and is hard/painful/impossible to retrofit later. It can often
yield VERY large performance improvements (for HTTP/1.1, for example, where
it turned out to be possible to retrofit to some extent, it can give
a factor of 2-10 performance improvement, from our measurements). Whether
it would make any sense to try to retrofit anything approximating UNIX
system call semantics onto such a base is far from clear to me...
So if you want to do this when designing a system, think about it first,
not later, and think about it hard!
- Jim
--
Jim Gettys
Compaq Computer Corporation
Visiting Scientist, World Wide Web Consortium, M.I.T.
http://www.w3.org/People/Gettys/
jg@w3.org, jg@pa.dec.com