Re: zero-copy TCP

From: Jamie Lokier (lk@tantalophile.demon.co.uk)
Date: Mon Sep 04 2000 - 08:21:59 EST


Linus Torvalds wrote:
> (The "invalidate on write" is the sane way of doing SMP cache coherency,
> which is probably why. Trying to have shared dirty cache-lines is just
> not a viable option in the end).

With DMA from a device -- "snoop and update" still results in only one
owner of the dirty cache-lines: the CPU. (Even SMP). But you are right
that it isn't implemented that way.

> Basically any copy <= 4 cache lines is "free" compared to trying to be
> clever.
n
We're obviously interested in larger packets than 128 bytes.

> If you truly wan tto do zero-copy from user space and get real UNIX
> semantics for writes(), you'd better protect the page somehow, so that
> if the data hasn't made it out to the network (or needed to be
> re-sent) by the time the system call returns [...]

That's why the data is DMA'd to the card immediately, and that _card_
retains the data at least for the short term. Long term if it's still
retained and the card runs out of memory, the card DMAs old buffers back
to a kernel skbuff. This is one way to avoid TLBs.

> That's when you get into TLB invalidates etc. By which time you're
> talking another few cache invalidates, and possibly some nasty cross-CPU
> calls for SMP TLB coherency.

If you have to do a TLB invalidate then yes of course it's a loss.
That must be avoided :-)

> People who claim "zero-copy" is a great thing often ignore the costs of
> _not_ copying altogether.

Some people who claim zero-copy is great have done actual measurements
and it does look good for reasonable size packets. Even though the raw
performance doesn't look that much better, the CPU utilisation does so
you can actually _calculate_ a bit more with your data.

However, I've not seen any evidence that it's a good idea with the
standard unix APIs. I suspect everyone will agree on that :-)

-- Jamie
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Thu Sep 07 2000 - 21:00:17 EST