Re: [PATCH] splice support #2

From: Linus Torvalds
Date: Thu Mar 30 2006 - 12:14:33 EST

On Thu, 30 Mar 2006, Linus Torvalds wrote:
>
> Actually, there _is_ a fundamental problem. Two of them, in fact.

Actually, four.

The third reason the pipe buffer is so useful is that it's literally a
stream with no position.

That may sound bad, but it's actually a huge deal. It's why standard unix
pipelines are so powerful. You don't pass around "this file, this offset,
this length" - you pass around a simple fd, and you can feed that fd data
without ever having to worry about what the reader of the data does. The
reader cannot seek around to places that you didn't want him to see, and
the reader cannot get confused about where the end is.
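
A minimal sketch of the "no position" property, assuming a POSIX system:
asking a pipe for its current offset fails with ESPIPE, so a reader has
nothing to seek on. The helper name is_positionless() is made up for this
illustration.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if fd is a stream with no file position
 * (lseek() fails with ESPIPE, as it does for pipes), 0 if fd is seekable,
 * and -1 on any other lseek() error.
 */
int is_positionless(int fd)
{
    assert(fd >= 0);

    if (lseek(fd, 0, SEEK_CUR) != (off_t)-1)
        return 0;                       /* fd has a position */

    return errno == ESPIPE ? 1 : -1;    /* pipe, FIFO or socket */
}
```

Both ends of a pipe() pair should report 1 here, while a regular file
opened with open() should report 0.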

The 4th reason is "tee". Again, you _could_ perhaps do "tee" without the
pipe, but it would be a total nightmare. Now, tee isn't that common, but
it does happen, and in particular it happens a lot with certain streaming
content.

Doing a "tee" with regular pipes is not that common: you usually just use
it for debugging or logging (i.e. you have a pipeline you want to debug,
and inserting "tee" in the middle is a good way to keep the same pipeline
while still being able to look at the intermediate data when something
goes wrong).

However, one reason "tee" _isn't_ that common with regular pipe usage is
that normal programs never need to do that anyway: all the pipe data
always goes through user space, so you can trivially do a "tee" inside of
the application itself without any external support. You just log the data
as you receive it.
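
As a rough sketch of that in-application "tee", assuming plain read() and
write() on blocking descriptors: every chunk passes through a user space
buffer, so logging it is just one more write(). The names write_all() and
copy_with_log() are invented for this example.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Write the whole buffer, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/*
 * Hypothetical helper: copy in_fd to out_fd until EOF, logging every
 * chunk to log_fd on the way. Returns bytes copied, or -1 on error.
 */
long copy_with_log(int in_fd, int out_fd, int log_fd)
{
    char buf[4096];
    long total = 0;

    assert(in_fd >= 0 && out_fd >= 0 && log_fd >= 0);

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return total;               /* EOF */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        /* The data is already in our buffer, so the "tee" is free. */
        if (write_all(log_fd, buf, (size_t)n) < 0 ||
            write_all(out_fd, buf, (size_t)n) < 0)
            return -1;
        total += n;
    }
}
```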

But with splice(), the whole _point_ of the system call is that at least a
portion of the data never hits a user space buffer at all. Which means
that suddenly "tee" becomes much more important, because it's the _only_
way to insert a point where you can do logging/debugging of the data.
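
For contrast, here is a sketch of the same copy done with splice(), written
against the interface as documented in splice(2) on current Linux (which
may differ in detail from the patch under discussion). The data goes
in_fd -> pipe -> out_fd and the program never holds it in a buffer of its
own, so there is no place to log it. splice_copy() and CHUNK are made-up
names for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define CHUNK 65536     /* assumed chunk size, not taken from the patch */

/*
 * Hypothetical helper: move everything from in_fd to out_fd through a
 * private pipe. Returns bytes moved, or -1 on error.
 */
long splice_copy(int in_fd, int out_fd)
{
    int p[2];
    long total = 0;

    assert(in_fd >= 0 && out_fd >= 0);
    if (pipe(p) < 0)
        return -1;

    while (total >= 0) {
        /* Fill the pipe from the source. */
        ssize_t n = splice(in_fd, NULL, p[1], NULL, CHUNK, SPLICE_F_MOVE);
        if (n == 0)
            break;                      /* end of input */
        if (n < 0) {
            total = -1;
            break;
        }
        /* Drain exactly what was queued into the destination. */
        while (n > 0) {
            ssize_t m = splice(p[0], NULL, out_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (m <= 0) {
                total = -1;
                break;
            }
            n -= m;
            total += m;
        }
    }

    close(p[0]);
    close(p[1]);
    return total;
}
```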

Now, I didn't do the "tee()" system call in my initial example thing, and
Jens didn't add it either, but I described it back in Jan-2005 with the
original description. It really is very fundamental if you ever want to
have a "plugin" kind of model, where you plug in different users to the
same data stream.

The canonical example is getting video input from an mpeg encoder, and
_both_ saving it to a file and sending it on in real-time to the app that
shows it in a window. Again, having the pipe is what allows this.
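
A sketch of that pattern, assuming the tee() and splice() calls as
documented in tee(2) and splice(2) on current Linux rather than anything
in the patch itself: tee() duplicates the queued pipe data into the
viewer's pipe without consuming it, and splice() then moves the same data
on to the save descriptor. tee_and_save() and CHUNK are invented names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

#define CHUNK 65536     /* assumed chunk size */

/*
 * Hypothetical helper: feed everything arriving on in_pipe to both
 * viewer_pipe (via tee) and save_fd (via splice). in_pipe and
 * viewer_pipe must be pipes. Returns bytes saved, or -1 on error.
 */
long tee_and_save(int in_pipe, int viewer_pipe, int save_fd)
{
    long total = 0;

    assert(in_pipe >= 0 && viewer_pipe >= 0 && save_fd >= 0);

    for (;;) {
        /* Duplicate without consuming: in_pipe still holds the data. */
        ssize_t n = tee(in_pipe, viewer_pipe, CHUNK, 0);
        if (n == 0)
            return total;               /* writer closed, nothing left */
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        /* Now consume the same n bytes into the save descriptor. */
        while (n > 0) {
            ssize_t m = splice(in_pipe, NULL, save_fd, NULL, (size_t)n,
                               SPLICE_F_MOVE);
            if (m <= 0)
                return -1;
            n -= m;
            total += m;
        }
    }
}
```

In the video case, in_pipe would be fed by the encoder, viewer_pipe would
lead to the display application, and save_fd would be the output file.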

Linus