sendfile() expert advice sought
From: Patrick J. LoPresti
Date: Tue Feb 16 2010 - 14:53:31 EST
Executive summary: Can I get the benefits of sendfile() for anonymous pages?
I have an application that generates hundreds of gigabytes of data per
hour. I want to push that data out over a TCP socket. (The network
connection will be fast; multiple bonded GigE lines or 10GigE.)
I gather that sendfile() is pretty efficient, so I would like to use
it. But I do not want to write all of my data to disk first. So I am
considering an approach like this:
    int fd = shm_open("/foo", O_RDWR|O_CREAT|O_TRUNC, 0600);
    ftruncate(fd, length);
    void *p = mmap(NULL, length, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    // (fill memory block at p with some data)
    sendfile(sock, fd, NULL, length);  // out_fd (socket) first, then in_fd
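For concreteness, a fuller version of that sketch might look like the following. This is only an illustration under assumptions: the helper name send_via_shm, the 0600 mode, the immediate shm_unlink(), and the memcpy() standing in for "fill memory block" are all made up for the example, and it says nothing about whether the kernel actually avoids a copy.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/sendfile.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: stage len bytes in a POSIX shared-memory object
 * and push them to sock with sendfile(). Returns 0 on success, -1 on error. */
static int send_via_shm(int sock, const char *name, const void *src, size_t len)
{
    int fd = shm_open(name, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    shm_unlink(name);                 /* the open fd keeps the object alive */

    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }
    memcpy(p, src, len);              /* stands in for "fill memory block" */

    /* sendfile() may send fewer bytes than requested, so loop until done.
     * It advances off itself when given a non-NULL offset pointer. */
    off_t off = 0;
    int rc = 0;
    while ((size_t)off < len) {
        ssize_t n = sendfile(sock, fd, &off, len - (size_t)off);
        if (n < 0 && errno == EINTR)
            continue;
        if (n <= 0) {
            rc = -1;
            break;
        }
    }

    munmap(p, len);
    close(fd);
    return rc;
}
```

A caller would do something like send_via_shm(sock, "/foo", buf, length) once per block. Older glibc versions may need -lrt at link time for shm_open().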
Questions:
1) Will this work at all? (Some online sources suggest sendfile()
does not work with tmpfs files, but I think this was fixed at some
point...)
2) Will it provide zero-copy behavior, or does the fact that the pages
are mapped in my process cause sendfile() to copy them?
3) If it is zero-copy, what happens if I overwrite the memory block
after sendfile() returns? Do I risk corrupting my data? (In
particular, suppose I have TCP_CORK set on the socket. Will
sendfile() return before all of the data has actually been sent,
giving me a window to corrupt my data? If so, how do I know when it
is "safe" to re-use the memory?)
4) If sendfile() is not zero-copy in this example, would I expect a
performance boost anyway, because sendfile() does not need to crawl
page tables or something?
Any responses or references will be appreciated.
Thanks!
- Pat
P.S. I know I could also try mmap()'ing "/dev/zero" and using
vmsplice(). Same set of questions, though.