read() syscall on a splice pipe returning -ENODATA

From: Abhijith Das
Date: Thu Feb 04 2016 - 14:48:20 EST


Hi all,

We're verifying writes to a gfs2 filesystem using splice reads, and we're seeing -ENODATA
being returned by the test program. The relevant bit of the test program looks like this:

static ssize_t
do_splice_read(int fd, void *buf, size_t nbytes, off_t offset)
{
	int pfd[2]; /* pipes to use for vmsplice/splice */
	int ret;
	ssize_t n, m;
	size_t left = nbytes;
	size_t left_read;
	off_t bufoff = 0;

	if ((ret = pipe(pfd)) == -1) {
		fprintf(stderr, "pipe() failed in do_splice_read(), %s\n", strerror(errno));
		return -1;
	}
	do {
		/* move up to 'left' bytes from the file into the pipe */
		n = splice(fd, &offset, pfd[1], NULL, left, SPLICE_F_MOVE);
		if (n < 0) {
			close(pfd[0]);
			close(pfd[1]);
			return -1;
		}
		if (n == 0) break;
		left_read = n;
		do {
			/* drain the spliced bytes from the pipe into buf */
			m = read(pfd[0], (char *)buf + bufoff, left_read);
			if (m < 0) {
				fprintf(stderr, "read in splicer failed, n = %zd left_read = %zu, errno = %d\n",
					n, left_read, errno);
				close(pfd[0]);
				close(pfd[1]);
				return -1;
			}
			left_read -= m;
			bufoff += m;
		} while (left_read);
		left -= n;
	} while (left);
	close(pfd[0]);
	close(pfd[1]);

	return nbytes - left;
}

I put a dump_stack() in the only place in fs/splice.c (page_cache_pipe_buf_confirm()) where
-ENODATA is returned, and the call stack looks like this:

[ 1201.662223] [<ffffffff81638424>] dump_stack+0x19/0x1b
[ 1201.663366] [<ffffffff8120ffaa>] page_cache_pipe_buf_confirm+0xba/0xc0
[ 1201.664818] [<ffffffff811e9b90>] pipe_read+0xf0/0x4c0
[ 1201.665988] [<ffffffff811e01fd>] do_sync_read+0x8d/0xd0
[ 1201.667170] [<ffffffff811e095c>] vfs_read+0x9c/0x170
[ 1201.668290] [<ffffffff811e14af>] SyS_read+0x7f/0xe0
[ 1201.669401] [<ffffffff81648c09>] system_call_fastpath+0x16/0x1b

So, the read(2) call following the splice(2) call triggers the -ENODATA. When I add a retry
loop on -ENODATA to the test program, the subsequent read(2) call succeeds.
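
For reference, the retry workaround looks roughly like the sketch below. The helper name
read_retry_enodata() and the max_retries bound are made up for this illustration and are
not part of the test program above; the actual retry loop may differ.

```c
#include <errno.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the original test program): retry
 * read(2) while it fails with ENODATA, at most max_retries extra times.
 * Returns whatever the last read(2) returned; errno is left as set.
 */
static ssize_t
read_retry_enodata(int fd, void *buf, size_t count, int max_retries)
{
	ssize_t m;
	int tries = 0;

	do {
		m = read(fd, buf, count);
	} while (m < 0 && errno == ENODATA && tries++ < max_retries);

	return m;
}
```

In do_splice_read() this would stand in for the plain read(pfd[0], ...) call in the inner
loop. It only hides the symptom and says nothing about whether -ENODATA should reach
userspace in the first place.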

I read through the splice/pipe code and have a few questions that I couldn't quite figure out
the answers to:

1. According to the description of the "confirm" function in struct pipe_buf_operations in
include/linux/pipe_fs_i.h, shouldn't page_cache_pipe_buf_confirm() be waiting for I/O
completion and/or trying to re-populate the page from the fs if it has been truncated?

2. In splice.c, page_cache_pipe_buf_confirm() is called by splice_from_pipe_feed(), which
turns a -ENODATA return value into 0 and returns it to the caller, generic_file_splice_write().
If -ENODATA is returned on the first page that is processed, generic_file_splice_write() would
return 0 (EOF). Is this valid, or should we attempt to retry as per 1. above?

3. When page_cache_pipe_buf_confirm() is called via splice_from_pipe_feed(), -ENODATA is being
(correctly?) handled, but when it is called through vfs_read()->pipe_read() we simply
return the -ENODATA error code to the user. Is this correct behavior? If so, how
should the user program deal with it?

Cheers!
--Abhi