Re: zero-copy TCP fileserving

Richard B. Johnson (root@chaos.analogic.com)
Thu, 3 Jun 1999 13:34:09 -0400 (EDT)


On Thu, 3 Jun 1999, Bjorn Wesen wrote:

> On Thu, 3 Jun 1999, Alan Cox wrote:
> > If the CPU can queue posted writes for the data and is still able to feed
> > itself without stalling, and there is enough bandwidth left for any other
> > bus masters _in the unlikely event it hits main memory_ then the write is
> > pretty cheap. Since we are going to want the data again for the actual
> > I/O transaction in most cases its extremely close to zero cost.
>
> But in the case of a heavy filetransfer, the data will need to hit main
> memory, especially if the NIC can bus-master DMA from main memory. The
> thing is, in a good system, the HD controller will do BM DMA to main
> memory and there should be no need at all for the CPU to touch the data.
> DMA in, DMA out. Checksum can be handled by the NIC. Now I'm not mad
> enough to claim that direct device<->device I/O is the way to go (which
> was discussed here recently :).
>
> I fully agree that a meagre 10 MByte/s (for filling a 100 MBit/s net) is
> a tiny load on a massive P2/400 with main memory bandwidths of hundreds of
> MB/s. But the issue was scalability mainly.
>

There really needs to be a change in the TCP/IP RFCs with respect to
local area networks, now that Ethernet is no longer the bandwidth-limiting
item.

Something like:

(1) A checksum of zero means don't check the checksum.
(2) You can't route a zero-checksum IP packet. If you route it,
you must checksum it before routing.
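
A minimal sketch of what rule (1) could look like on the receive side,
assuming a simplified layout in which the checksum is carried separately
from the bytes it covers (real TCP sums a pseudo-header and the header
itself as well). The names inet_checksum, wire_checksum and
lan_checksum_ok are made up for illustration and are not kernel
functions. The summing routine follows the RFC 1071 one's-complement
algorithm.

```c
#include <stddef.h>
#include <stdint.h>

/* RFC 1071 Internet checksum: one's-complement of the one's-complement
 * sum of the buffer taken as 16-bit big-endian words. */
static uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;   /* pad odd byte with zero */
    while (sum >> 16)                         /* fold carries back in */
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)~sum;
}

/* Assumption: since 0 now means "no checksum", a computed value of 0
 * is sent as 0xffff, the same convention UDP already uses. */
static uint16_t wire_checksum(const uint8_t *buf, size_t len)
{
    uint16_t c = inet_checksum(buf, len);
    return c ? c : 0xffff;
}

/* Hypothetical rule (1): a stored checksum of 0 means "don't check".
 * Returns 1 if the packet should be accepted, 0 if it should be dropped. */
static int lan_checksum_ok(const uint8_t *buf, size_t len, uint16_t stored)
{
    if (stored == 0)
        return 1;                 /* trust the link-level CRC instead */
    return wire_checksum(buf, len) == stored;
}
```

With this, a LAN sender that wants to skip the work simply stores 0, and
a receiver only pays for the sum when the sender actually supplied one.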

This would allow LANs to operate at higher speeds where a hardware CRC
makes sure the data is good anyway, while maintaining compatibility
outside the local area. This would also put the total responsibility
for the IP checksum on the routers, which are designed for it, or on
slow communications links like PPP, ISDN, and T1, which are so slow
that the additional overhead is lost in the noise.

I heard that at one time Sun didn't bother to checksum packets that
arrived via Ethernet. I don't know if it's true. Of course without
a rule-change this is problematical because a dirty PPP link could
be routed to an Ethernet LAN and be accepted even though the data
was corrupt.
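
The router half of the proposed rule change (rule 2) might be sketched
like this. It is only an illustration under assumptions: struct pkt,
csum16 and router_prepare_forward are hypothetical names, and the packet
is reduced to a checksum field plus the bytes it covers. The point is
that anything leaving the LAN always carries a real checksum, so a
packet that later crosses a dirty PPP link can still be verified at the
far end.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet view: only the fields this sketch needs. */
struct pkt {
    uint16_t csum;          /* 0 = "not checksummed" under rule (1) */
    const uint8_t *data;    /* bytes covered by the checksum */
    size_t len;
};

/* RFC 1071 style one's-complement checksum over big-endian 16-bit
 * words. A result of 0 is returned as 0xffff so that 0 stays reserved
 * for "no checksum" (an assumption borrowed from UDP's convention). */
static uint16_t csum16(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    uint16_t c;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    c = (uint16_t)~sum;
    return c ? c : 0xffff;
}

/* Rule (2): never route a zero-checksum packet as-is. Returns 1 if the
 * router had to compute the checksum, 0 if one was already present. */
static int router_prepare_forward(struct pkt *p)
{
    if (p->csum != 0)
        return 0;
    p->csum = csum16(p->data, p->len);
    return 1;
}
```

Note that the router can only checksum the bytes as it received them, so
this still leans on the Ethernet CRC for the LAN hop itself.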

Cheers,
Dick Johnson
***** FILE SYSTEM WAS MODIFIED *****
Penguin : Linux version 2.2.6 on an i686 machine (400.59 BogoMips).
Warning : It's hard to remain at the trailing edge of technology.
