Pardon? If the PC one is "fast", the SPARC one must be absolutely awful to
make it look that way. Triton isn't bad, but Neptune definitely leaves me
with bandwidth problems, even with a single CPU.
> I figure if you do something like:
>
> load source ! source enters cache
> do checksum calculation ! fill the pipeline
> null load from dest ! dest enters cache even if no-wr-alloc
> store to destination
>
> If both source and dest keep the cache streaming data in, _and_
> continues to hold the destination by the time the store happens, you
> get a really nice copy bandwidth streaming effect (1gb/s as you
> mentioned on nice cache architectures.)
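
For reference, the quoted loop could be sketched in C roughly as follows.
This is only an illustration under stated assumptions, not code from any
kernel: the name csum_copy_words is made up, the buffers are assumed to be
32-bit aligned and a whole number of words long, and the dummy read of the
destination is done per word for simplicity where a real routine would
touch each cache line once. The sum is a 16-bit ones'-complement sum in
native byte order, not inverted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copy nwords 32-bit words from src to dst and
 * return the folded 16-bit ones'-complement sum of the data copied.
 * dst must point at readable memory, since each word is read once
 * before it is overwritten. */
uint16_t csum_copy_words(uint32_t *dst, const uint32_t *src, size_t nwords)
{
    uint64_t sum = 0;
    size_t i;

    assert(nwords == 0 || (dst != NULL && src != NULL));

    for (i = 0; i < nwords; i++) {
        uint32_t w = src[i];                 /* load source: source enters cache */
        sum += w;                            /* checksum calculation */
        (void)*(volatile uint32_t *)&dst[i]; /* null load: pull dest into cache */
        dst[i] = w;                          /* store to destination */
    }

    /* Fold the wide accumulator down to 16 bits with end-around carry. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)sum;
}
```

Whether the dummy load actually helps depends on the cache's write
allocation policy, so treat it as something to measure rather than assume.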
The MIPS checksum done by Van Jacobson is apparently a bit different: it
loads one cache line into registers while adding up the previous one.
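
The general idea of overlapping the loads of one line with the adds of the
previous one might look something like the C below. This is not Van
Jacobson's code, just a sketch of the technique: the name csum_pipelined
and the 32-byte line size (LINE_WORDS) are assumptions, the buffer is
assumed to be a whole number of lines, and the local arrays stand in for
registers. The result is a folded 16-bit ones'-complement sum in native
byte order, not inverted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LINE_WORDS 8   /* assumed: 32-byte cache line = 8 x 32-bit words */

/* Hypothetical helper: checksum nlines cache lines starting at buf,
 * loading line N+1 into locals while summing line N. */
uint16_t csum_pipelined(const uint32_t *buf, size_t nlines)
{
    uint32_t cur[LINE_WORDS], next[LINE_WORDS];
    uint64_t sum = 0;
    size_t i, line;

    assert(nlines == 0 || buf != NULL);
    if (nlines == 0)
        return 0;

    /* Prime the pipeline: load the first line, no adds yet. */
    for (i = 0; i < LINE_WORDS; i++)
        cur[i] = buf[i];

    for (line = 1; line < nlines; line++) {
        const uint32_t *p = buf + line * LINE_WORDS;

        for (i = 0; i < LINE_WORDS; i++) {
            next[i] = p[i];   /* load the next line ...               */
            sum += cur[i];    /* ... while adding up the previous one */
        }

        /* A hand-tuned version would unroll twice and swap register
         * sets instead of copying. */
        for (i = 0; i < LINE_WORDS; i++)
            cur[i] = next[i];
    }

    /* Drain the pipeline: add the last line loaded. */
    for (i = 0; i < LINE_WORDS; i++)
        sum += cur[i];

    /* Fold the wide accumulator down to 16 bits with end-around carry. */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)sum;
}
```

A compiler is free to reorder these loads and adds, so getting the overlap
for real usually means writing the inner loop in assembly.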
Alan