Re: unusual startup messages

Craig Milo Rogers (rogers@isi.edu)
Fri, 01 Nov 96 11:30:14 PST


>Yes, you can do it in user space, but your performance will suck unless you
>ignore security issues or have special hardware (and the special hardware
>would essentially have to do 90% of the stuff we do in kernel space now: I'm
>talking _really_ special hardware).

Actually, the hardware needn't be that special. It mainly
needs to identify "normal" incoming TCP and UDP packets, and store
them (via DMA or shared memory) directly into buffers that are mapped
into the corresponding user processes; this may be done with a
high-speed state machine and socket lookup hash table. It is also
desirable to have the hardware enforce certain security and sanity
checks on outgoing packets; this can be done with a template
mechanism. Finally, it is desirable to have the hardware calculate
the TCP/UDP checksums, since there's no longer a kernel-level copy in
which to hide the calculation.
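
As a rough illustration of the receive-side socket lookup and the
transmit-side template check described above, here is a software
sketch in C. It is only a model of what such hardware might do, not
code from any real adapter: the names (flow_key, flow_insert,
flow_lookup, template_ok), the table size, and the hash function are
all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical key identifying one "normal" TCP or UDP flow. */
struct flow_key {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t  proto;               /* 6 = TCP, 17 = UDP */
};

/* One table slot: a flow and the user-mapped buffer that receives it. */
struct flow_slot {
    struct flow_key key;
    void *user_buf;               /* assumed to be mapped into the owner */
    int in_use;
};

#define TABLE_SIZE 256            /* power of two, so indices can be masked */

static struct flow_slot table[TABLE_SIZE];

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

/* Simple mixing hash; an arbitrary choice made for this sketch. */
static uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = k->src_ip ^ (k->dst_ip * 2654435761u);
    h ^= ((uint32_t)k->src_port << 16) | k->dst_port;
    h ^= k->proto;
    h ^= h >> 16;
    return h & (TABLE_SIZE - 1);
}

/* Register a flow using linear probing. Returns 0, or -1 if the table is full. */
int flow_insert(const struct flow_key *k, void *user_buf)
{
    uint32_t i = flow_hash(k);
    for (size_t n = 0; n < TABLE_SIZE; n++, i = (i + 1) & (TABLE_SIZE - 1)) {
        if (!table[i].in_use || key_equal(&table[i].key, k)) {
            table[i].key = *k;
            table[i].user_buf = user_buf;
            table[i].in_use = 1;
            return 0;
        }
    }
    return -1;
}

/* Receive side: return the user buffer for a packet, or NULL if the
 * packet is not a known flow and must take the slow path. */
void *flow_lookup(const struct flow_key *k)
{
    uint32_t i = flow_hash(k);
    for (size_t n = 0; n < TABLE_SIZE; n++, i = (i + 1) & (TABLE_SIZE - 1)) {
        if (!table[i].in_use)
            return NULL;          /* empty slot ends the probe sequence */
        if (key_equal(&table[i].key, k))
            return table[i].user_buf;
    }
    return NULL;
}

/* Transmit side: every header bit selected by the mask must match the
 * template the kernel installed for this socket. Returns 1 if it does. */
int template_ok(const uint8_t *hdr, const uint8_t *tmpl,
                const uint8_t *mask, size_t len)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < len; i++)
        diff |= (uint8_t)((hdr[i] ^ tmpl[i]) & mask[i]);
    return diff == 0;
}
```

In this model the kernel would call flow_insert() when a socket is
set up and install the template and mask; the per-packet work is then
just flow_lookup() on receive and template_ok() on send. A lookup
miss hands the packet to the slow path rather than dropping it.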

The hardware functions correspond to part of the "fast
path" of a high-performance Internet stack. The rest of the Internet
stack can be implemented in user space.

>It's been talked about (favourite thing in microkernels is to move critical
>resources to places they don't belong ;), but I doubt _anybody_ seriously
>thinks that it can be done in user space while keeping up with things like
>HIPPI and having full memory protection.

The Netstation and Atomic-2 projects at ISI believe it is
possible. (Netstation is primarily directed at Internet-addressable
peripherals; think of your processor, display adaptor, and disks
as each having their own IP addresses. Atomic-2 has been investigating
user-level protocol APIs.)

http://www.isi.edu/div7/netstation/
http://www.isi.edu/div7/atomic2/

>Oh, and the big problem isn't throughput: you can essentially get throughput
>by having large buffers (modulo the problems with maxing out your memory
>bandwidth which in itself is a large problem). The _big_ problem is latency.

It is necessary to have real-time scheduling and switching of
user-level threads. Scheduling latencies should be addressed as part
of supporting POSIX (mumble), the real-time extensions. Switching
*does* pose latency problems, because of the overheads (instructions,
cache misses) of switching between multiple user contexts (instead of
switching in/out of a single kernel-level interrupt context). The
latency problem of the TCP/UDP checksum calculation can be overcome
with a modest hardware investment.
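
For reference, the calculation in question is small. Below is a
minimal C sketch of the standard Internet checksum (the 16-bit one's
complement sum described in RFC 1071) over a byte buffer. The
function name inet_checksum is my own, and the sketch leaves out the
pseudo-header that real TCP and UDP checksums also cover; it only
shows the kind of per-byte work that a network adapter could do as
the data streams through.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over len bytes, treating the data as
 * big-endian 16-bit words. Returns the one's complement of the
 * one's complement sum, in host byte order. */
uint16_t inet_checksum(const uint8_t *data, size_t len)
{
    uint64_t sum = 0;

    while (len > 1) {
        sum += ((uint32_t)data[0] << 8) | data[1];
        data += 2;
        len -= 2;
    }
    if (len == 1)                 /* odd trailing byte is padded with zero */
        sum += (uint32_t)data[0] << 8;

    while (sum >> 16)             /* fold carries back into the low 16 bits */
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

A receiver can verify a packet by running the same function over the
data with the checksum field included; a result of zero means the
checksum matched.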

>In short, forget about networking in user space unless you're talking >10ms
>latencies and <10Mbps throughput. 10Mbps ethernet may still be realistic in
>user space. Just. But the world is moving to 100Mbps and beyond.

I'm not really up on the status of these projects, but I
believe that Atomic-2 has demonstrated (non-IP) user-level protocol
stacks operating in excess of 200 Mbps on Sun SPARC-20/71s. It is
believed (but has not, to my knowledge, been demonstrated) that the
same performance can be obtained for TCP- and UDP-based stacks.

Craig Milo Rogers