On Sun, 9 Jun 1996, Linus Torvalds wrote:
>
>
> On Sat, 8 Jun 1996, Robert L Krawitz wrote:
> >
> > 643 23.78% 00191324 csum_partial_copy_fromuser
> > 997 36.88% 001369c8 memcpy_toiovec
> > 2703 100.00% 00000000 total
> >
> > That's very, very interesting. Somehow the checksum routine was
> > faster than the raw memcpy routine.
>
> Yes. But there could be secondary stuff here that doesn't show up, like
> the block being in the cache before entering the system call, but the TCP
> processing flushing the cache enough that by the time we copy it out to
> user space we have nothing cached any more..
>
> > (This is on a P166 with a reasonably good memory subsystem, and the
> > machine was sending 500MB of data over TCP loopback).
> >
> > I presume that means EDO RAM. If so, I'm guessing that
> > csum_partial_copy_fromuser was running at about 45 MB/sec, and hence
> > your overall throughput was something like 11 MB/sec?
>
> I actually get closer to 20MB/s over TCP loopback. It's EDO RAM and sync
> burst cache etc. 20MB/s over TCP isn't bad on a machine that does memcpy
> at 43MB/s..
>
> Linus
>
Your explanations about memory speed are very interesting, and the results
with TCP loopback are impressive.
I have done some memory speed tests under X11 with unix sockets.
My new hardware is the following:
- TYAN Tomcat I, P133, 512K pipelined cache, 32MB EDO, Diamond 968 2MB VRAM, etc.
Under X11, the maximum rate I can expect from "x11perf -putimage500"
is about 45 images/sec.
Under 1.99.8, I get about 27 images/sec.
Under 1.2.13, with a very short patch that changes the "head/tail"
algorithm to "start/end" and allows more than one page for the circular
buffers of the old unix sockets, I got the following results:
- 8K buffer: 37 images/sec
- 16K buffer: 41 images/sec
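
To illustrate the idea of a circular buffer that is not limited to one
page, here is a minimal sketch in Python. It is not the patch itself: the
class name, the method names and the reading of "start/end" as two
free-running byte counters are all assumptions made for the example.

```python
class RingBuffer:
    """Byte ring buffer indexed by free-running start/end counters.

    Hypothetical illustration only; the real kernel code is C and differs.
    """

    def __init__(self, size=16 * 1024):
        self.size = size            # any size, not tied to one 4K page
        self.buf = bytearray(size)
        self.start = 0              # total bytes read so far
        self.end = 0                # total bytes written so far

    def used(self):
        return self.end - self.start

    def write(self, data):
        """Copy in as much of data as fits; return the bytes accepted."""
        n = min(len(data), self.size - self.used())
        pos = self.end % self.size
        first = min(n, self.size - pos)          # run up to the buffer end
        self.buf[pos:pos + first] = data[:first]
        self.buf[:n - first] = data[first:n]     # wrapped part, may be empty
        self.end += n
        return n

    def read(self, n):
        """Remove and return up to n bytes, oldest first."""
        n = min(n, self.used())
        pos = self.start % self.size
        first = min(n, self.size - pos)
        out = bytes(self.buf[pos:pos + first]) + bytes(self.buf[:n - first])
        self.start += n
        return out
```

With a larger buffer the writer can queue more data before it has to
sleep, which is presumably why 16K does better than 8K in the numbers
above.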
The current unix sockets eat roughly 40% of the memory throughput
(27 of the expected 45 images/sec).
I got about 90% of the bandwidth (41 of 45 images/sec) with the old unix
sockets and 16K+16K buffers for each client.
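
These percentages come from dividing each measured rate by the expected
ceiling of 45 images/sec. A small Python sketch of the arithmetic follows;
the function and variable names are made up for the example, and treating
images/sec as a proxy for memory bandwidth is an assumption.

```python
CEILING = 45.0  # expected maximum for x11perf putimage500, in images/sec

# Measured rates quoted in this message, in images/sec.
measured = {
    "1.99.8, current unix sockets": 27,
    "1.2.13 patched, 8K buffer": 37,
    "1.2.13 patched, 16K buffer": 41,
}


def share_of_ceiling(rate, ceiling=CEILING):
    """Return the measured rate as a percentage of the expected maximum."""
    return 100.0 * rate / ceiling


for name, rate in measured.items():
    pct = share_of_ceiling(rate)
    print(f"{name}: {pct:.0f}% of ceiling, {100 - pct:.0f}% lost")
```

This gives about 60%, 82% and 91% of the ceiling respectively, so reaching
80% means at least 36 images/sec.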
I do not have any further comments on these results.
However, it seems to me that the current unix sockets should be improved
so that local X11 applications can use more than 80% of the memory
speed.
Gerard.