Re: zero-copy TCP fileserving

Linus Torvalds (torvalds@transmeta.com)
4 Jun 1999 07:59:21 GMT


In article <Pine.LNX.4.10.9906032141490.5973-100000@hoser>,
Zach Brown <zab@zabbo.net> wrote:
>
>my reading of the databook implies that the 905b downloads the packet into
>the fifo then tacks in the checksum..

Note that this can have nasty side effects, and can actually hurt
performance. In particular, latency for a single packet can be
seriously degraded by this "optimization".

Rationale: the memory bus is often much faster than the IO bus. In fact
it had better be, or it is badly designed. So let's say that it takes
10us of CPU time to copy and checksum the packet, and then takes 100us
of real time (during which the CPU can be doing something else, because
by now we're doing DMA) to move the packet into the network card.

If the checksum is done by hand with a "stupid" network card, the card
can start sending out the packet after 10us - it just does the DMA
read in parallel with the transmission, reading ahead by a small
amount.
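
For illustration, "copy and checksum by hand" can be done in a single
pass over the data, so the checksum costs little on top of the copy.
The sketch below is only an assumption about what such a routine looks
like: the function name copy_and_checksum is hypothetical, and it uses
the standard 16-bit ones'-complement Internet checksum rather than any
particular kernel routine.

```python
def copy_and_checksum(src: bytes, dst: bytearray) -> int:
    """Copy src into dst and return the 16-bit Internet checksum of src.

    Hypothetical helper: the copy and the ones'-complement sum share
    one loop, so the data is only walked once.
    """
    total = 0
    n = len(src)
    # Walk the buffer one 16-bit big-endian word at a time.
    for i in range(0, n - 1, 2):
        dst[i] = src[i]
        dst[i + 1] = src[i + 1]
        total += (src[i] << 8) | src[i + 1]
    # An odd trailing byte is treated as if padded with a zero byte.
    if n % 2:
        dst[n - 1] = src[n - 1]
        total += src[n - 1] << 8
    # Fold the carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")
out = bytearray(len(payload))
csum = copy_and_checksum(payload, out)
print(hex(csum))
```

Once this returns, the checksum is already known, so the card has no
reason to wait for the whole packet before it starts transmitting.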

The "intelligent" card takes 100us to start sending the packet out,
because it needs to get the whole packet on board before it can start
sending it out in order to have the correct checksum.

See?
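
The arithmetic above can be written down as a tiny model of the time
until the first byte goes out on the wire. This is only a sketch: the
10us and 100us figures are the ones from the example, while the
read-ahead figure and the function names are assumptions made up for
illustration.

```python
# Figures from the example above.
CHECKSUM_COPY_US = 10.0   # CPU time to copy and checksum the packet
DMA_TRANSFER_US = 100.0   # time to DMA the whole packet onto the card

# Assumption (not from the example): the small head start the DMA read
# needs before the "stupid" card begins transmitting.
READ_AHEAD_US = 1.0


def first_byte_host_checksum(checksum_us: float, read_ahead_us: float) -> float:
    """Host checksums; the card transmits while it is still reading via DMA."""
    return checksum_us + read_ahead_us


def first_byte_card_checksum(dma_us: float) -> float:
    """Card checksums; the whole packet must be on board before sending."""
    return dma_us


stupid = first_byte_host_checksum(CHECKSUM_COPY_US, READ_AHEAD_US)
smart = first_byte_card_checksum(DMA_TRANSFER_US)
print(f"host checksum: first byte after ~{stupid:.0f}us")
print(f"card checksum: first byte after ~{smart:.0f}us")
```

With these numbers the "stupid" card starts sending roughly an order
of magnitude sooner, even though the total amount of data moved over
the IO bus is the same in both cases.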

And yes, the above is not just a theoretical issue. It actually does
happen. The "stupid" approach can be quite noticeably faster (and
that's not even taking into account that the intelligent approaches tend
to require a _lot_ more "infrastructure" processing to set everything
up).

So don't fall into the trap of thinking that zero-copy is always
obviously a win. It isn't.

Linus
