Re: Linux 2.6.17-rc2

From: Peter Naulls
Date: Wed Apr 19 2006 - 18:18:44 EST


Linus Torvalds wrote:

On Wed, 19 Apr 2006, Trond Myklebust wrote:
Any chance this could be adapted to work with all those DMA (and RDMA)
engines that litter our motherboards? I'm thinking in particular of
stuff like the drm drivers, and userspace rdma.

Absolutely. Especially with "vmsplice()" (the not-yet-implemented "move these user pages into a kernel buffer") it should be entirely possible to set up an efficient zero-copy setup that does NOT have any of the problems with aio and TLB shootdown etc.

Note that a driver would have to support the splice_in() and splice_out() interfaces (which are basically just given the pipe buffers to do with as they wish), and perhaps more importantly: note that you need specialized apps that actually use splice() to do this.

That's the biggest downside by far, and is why I'm not 100% convinced splice() usage will be all that widespread. If you look at sendfile(), it's been available for a long time, and is actually even almost portable across different OS's _and_ it is easy to use. But almost nobody actually uses it. I suspect the only users are some apache mods, perhaps an ftp daemon or two, and probably samba. And that's probably largely it.

I am one of those sendfile() users. I'm developing a distributed file
system responsible for transferring GBs of files around a network. The
biggest problem with the traditional send/recv/poll approach that was
in use was heavy CPU usage: maxing out the gigabit network eats about
60% CPU. In some simple experiments, sendfile reduced that to 10% or
less (the exact figure varies a lot with whatever else is going on).
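
For illustration, a minimal sketch of the kind of sendfile() loop meant
here. The helper name send_file_range() and its error handling are
assumptions made for this example, not code from the actual file
system; it only shows the shape of the call that replaces a
read()/send() loop.

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: push `count` bytes of in_fd (a regular file),
 * starting at `offset`, to out_fd without copying them through a
 * userspace buffer. Returns the number of bytes sent, or -1 on error.
 */
ssize_t send_file_range(int out_fd, int in_fd, off_t offset, size_t count)
{
    size_t left = count;

    while (left > 0) {
        /* sendfile() advances `offset` by the number of bytes it moved. */
        ssize_t n = sendfile(out_fd, in_fd, &offset, left);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;              /* hit end of file early */
        left -= (size_t)n;
    }
    return (ssize_t)(count - left);
}
```

The loop is there because sendfile() may move fewer bytes than asked
for, just as write() can.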

One big problem I had is that sendfile is not symmetric (for quite
understandable reasons), but that meant the overlying file system API
(it's a userspace library) had to undergo various changes to make
effective use of sendfile. Doing so in a sensible manner proved
tricky, but not impossible.

Anyway, CPU usage is still a big deal, which is why I'm interested
in these new zero-copy calls, having just caught up on the discussion
about them. And if I decide to use them, that means moving a whole
load of machines to 2.6.17, some of which will be running 2.6.12
for at least a little while longer. I guess I might be asking
for the opposite of this:

So I'd expect this to be most useful for perhaps things like some HPC apps, where you can have specialized libraries for data communication. And servers, of course (but they might just continue to use the old "sendfile()" interface, without even knowing that it's not sendfile() any more, but just a wrapper around splice()).

i.e., a splice emulation that happens to use sendfile when it can.
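
Roughly what I have in mind, as a sketch only: a userspace wrapper that
tries sendfile() first and falls back to a plain read()/write() copy
when the kernel rejects the descriptor pair. The name copy_fd() and the
choice of EINVAL/ENOSYS as the fallback triggers are assumptions for
this example; it is not a full splice() emulation (no flags, no pipe
buffer semantics).

```c
#include <errno.h>
#include <sys/sendfile.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all of buf[0..len) to fd, retrying on short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t w = write(fd, buf, len);

        if (w < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        buf += w;
        len -= (size_t)w;
    }
    return 0;
}

/*
 * Hypothetical wrapper: move up to `count` bytes from in_fd to out_fd,
 * starting at in_fd's current position. Uses sendfile() when the
 * kernel accepts it, otherwise copies through a userspace buffer.
 * Returns the number of bytes moved, or -1 on error.
 */
ssize_t copy_fd(int out_fd, int in_fd, size_t count)
{
    char buf[65536];
    size_t done = 0;
    int use_sendfile = 1;

    while (done < count) {
        size_t want = count - done;
        ssize_t n;

        if (use_sendfile) {
            n = sendfile(out_fd, in_fd, NULL, want);
            if (n < 0 && (errno == EINVAL || errno == ENOSYS)) {
                use_sendfile = 0;   /* unsupported here: use the slow path */
                continue;
            }
        } else {
            if (want > sizeof buf)
                want = sizeof buf;
            n = read(in_fd, buf, want);
            if (n > 0 && write_all(out_fd, buf, (size_t)n) < 0)
                return -1;
        }

        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;                  /* end of input */
        done += (size_t)n;
    }
    return (ssize_t)done;
}
```

The caller sees one interface either way; only the CPU cost differs
depending on which path the kernel allows.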

I very much appreciate the conceptual improvements that splice has
over sendfile, but can anyone give some examples of significant CPU
savings that would not be possible using sendfile?








