Re: Help on TCP Output/Input

Zachary Amsden (zamsden@cthulhu.engr.sgi.com)
Thu, 07 Oct 1999 13:48:05 -0700


>From: Koushik Chakraborty <kchak@iitk.ac.in>

>Hi,
> I am developing an application which needs certain support from kernel
>(via system call) to speed up data transfer from one socket to one or more
>sockets. I don't want to use the system calls like read,write as this
>involves too many buffer copies and also context switches (from user to
>kernel). Instead, I want the kernel to read the data from sk_buff que of
>the source sockets and push directly into those for the destination
>sockets.

snip

This is not possible with the current kernel API. The best you can get
is file->network transfers using sendfile. What you are looking for is
something similar to the splice() system call, or a zero-copy I/O path.
Zero copy I/O is not possible without hardware support of some kind, as
even if we can pull VM games, we still need to checksum.

> A> At which point in the TCP implementaion, do you have ordered,
>non-duplicated data stream (as if ready to be consumed by the
>application).

As soon as an ACK is sent or the system tries to delay an ACK and
piggyback it onto another.

> B> Is it possible to retrive this data stream and form into sk_buff and
>subsequently call tcp_write for the outgoing sockets ? I mean will it
>actually take care of ACK's and sliding-window management ?

Not from userspace.

> C> Can anybody suggest any other or possibly best way to achieve this ?

Expect to see this sometime in the next month or two available as a
patch.

High-performance I/O and Networking Software in Sequoia 2000
by Joseph Pasquale, Eric W. Anderson, Kevin Fall, Jonathan S. Kay
http://www.asia-pacific.digital.com/DTJJ06/DTJJ06SC.TXT

Cues: File Subsystem Data Streams[1]
Sandeep Gupta [2], John S. Baras [4], Stephen Kelley, Nick Roussopoulos
[2,3]
http://www.umiacs.umd.edu/~sandeep/HTML/Cues/paper.html

A case for in-kernel data streaming over the file subsystem [a1]
http://www.umiacs.umd.edu/~sandeep/HTML/CaseforFSstreams/tr-case.html

V. S. Pai, P. Druschel, W. Zwaenepoel, "IO-Lite: A unified I/O buffering
and caching system," Rice University CS Technical Report TR97-294.
http://www.cs.rice.edu/~vivek/IO-Lite/TR97-294.ps

The original Splice Paper
Larry McVoy
http://www.bitmover.com/lm/papers/splice.ps

Zachary Amsden
zamsden@engr.sgi.com

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/