Re: sendfile and EAGAIN

From: Eric Dumazet
Date: Mon Feb 25 2013 - 14:23:06 EST


On Mon, 2013-02-25 at 12:22 -0500, Ulrich Drepper wrote:
> When using sendfile with a non-blocking output file descriptor for a
> socket the operation can cause a partial write because of capacity
> issues. This is nothing critical and the operation could resume after
> the output queue is cleared. The problem is: there is no way to
> determine where to resume.
>
> The system call just returns -EAGAIN without any further indication.
> The caller doesn't know what to resend.
>
> And this even though the interface of sendfile would be capable of
> communicating this information and the man page (I know, it's not
> authoritative) describes this behavior as well.
>
> The problem is probably in a few places, here is one (fs/splice.c):
>
> static ssize_t default_file_splice_write(struct pipe_inode_info *pipe,
> 					 struct file *out, loff_t *ppos,
> 					 size_t len, unsigned int flags)
> {
> 	ssize_t ret;
>
> 	ret = splice_from_pipe(pipe, out, ppos, len, flags, write_pipe_buf);
> 	if (ret > 0)
> 		*ppos += ret;
>
> 	return ret;
> }
>
> Note that *ppos is only updated if the call doesn't fail. We could
> also update the position if ret == -EAGAIN. This would require
> re-architecting the system a bit, either to update *ppos in
> splice_from_pipe etc. or to communicate the number of bytes written
> back from the splice_from_pipe call. In any case, the result would
> be that the caller knows where to resume the operation.
>
> I would argue that this doesn't break the ABI. If existing programs
> today just resend packages from the beginning, they will have sent
> an unpredictable number of bytes in the previous sendfile() call,
> making the state of the communication unpredictable.
>
> Opinions? I think that, as is, sendfile() isn't useful with O_NONBLOCK.
> --

I don't understand the issue.

sendfile() returns -EAGAIN only if no bytes were copied to the socket.

If some bytes were copied, sendfile() returns the number of bytes,
exactly like write() would do for a partial write.

I guess the following should work (well... with better tests)

offset = 0;
while (offset < len) {
	/* sendfile() itself advances offset by the number of bytes sent */
	res = sendfile(sock, fd, &offset, len - offset);
	if (res > 0)
		continue;
	if (res == 0)
		break;	/* EOF on fd */
	if (errno != EAGAIN)
		break;
	wait_some_event();
}



