[PATCH 5.8 016/124] splice: teach splice pipe reading about empty pipe buffers

From: Greg Kroah-Hartman
Date: Mon Oct 12 2020 - 09:46:26 EST


From: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>

commit d1a819a2ec2d3b5e6a8f8a9f67386bda0ad315bc upstream.

Tetsuo Handa reports that splice() can return 0 before the real EOF, if
the data in the splice source pipe is an empty pipe buffer. That empty
pipe buffer case doesn't happen in any normal situation, but you can
trigger it by doing a write to a pipe that fails due to a page fault.

Tetsuo has a test-case to show the behavior:

    #define _GNU_SOURCE
    #include <sys/types.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(int argc, char *argv[])
    {
        const int fd = open("/tmp/testfile", O_WRONLY | O_CREAT, 0600);
        int pipe_fd[2] = { -1, -1 };
        pipe(pipe_fd);
        write(pipe_fd[1], NULL, 4096);
        /* This splice() should wait unless interrupted. */
        return !splice(pipe_fd[0], NULL, fd, NULL, 65536, 0);
    }

which results in

    write(5, NULL, 4096) = -1 EFAULT (Bad address)
    splice(4, NULL, 3, NULL, 65536, 0) = 0

and this can confuse splice() users into believing they have hit EOF
prematurely.
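
To illustrate why the spurious 0 matters, here is a minimal sketch of the
kind of user-space loop that treats a 0 return from splice() as EOF. The
helper name drain_pipe_to_fd() and the 65536-byte chunk size are assumptions
made for illustration; they are not taken from the patch or from any
particular application.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of the patch): move everything readable
 * from a pipe into out_fd with splice(), stopping when splice() returns 0.
 * Returns the total number of bytes moved, or -1 on error with errno set.
 */
static long long drain_pipe_to_fd(int pipe_rd, int out_fd)
{
	long long total = 0;

	for (;;) {
		ssize_t n = splice(pipe_rd, NULL, out_fd, NULL, 65536, 0);

		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, try again */
			return -1;
		}
		if (n == 0)
			break;	/* taken to mean EOF: no data, no writers */
		total += n;
	}
	return total;
}
```

A loop of this shape stops early if splice() reports 0 while a writer still
has the pipe open, which is the confusion described above.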

The issue was introduced when the pipe write code started pre-allocating
the pipe buffers before copying data from user space.
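
The effect of that ordering can be sketched with a small user-space model.
Everything here (struct model_pipe, model_write(), the two model_read_*()
helpers, the 4-slot ring) is a simplified stand-in invented for illustration
and is not the kernel's code: it has no locking, no full-ring check, and no
blocking. It only shows that a slot reserved before a failed copy stays in
the ring with a length of zero, and that a reader has to skip such slots
instead of reporting them as data.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_RING_SIZE 4	/* must be a power of two */

/* Simplified stand-in for a pipe ring; not the kernel's structures. */
struct model_pipe {
	unsigned int head;		/* next slot the writer fills */
	unsigned int tail;		/* next slot the reader consumes */
	size_t len[MODEL_RING_SIZE];	/* bytes held by each slot */
};

/* Writer: reserve the slot first, then "copy". A faulting copy leaves len == 0. */
static void model_write(struct model_pipe *p, size_t bytes, bool copy_faults)
{
	unsigned int slot = p->head & (MODEL_RING_SIZE - 1);

	p->len[slot] = 0;
	p->head++;		/* the slot is visible to readers from here on */
	if (!copy_faults)
		p->len[slot] = bytes;
}

/* Reader without the fix: any occupied slot is reported, even an empty one. */
static size_t model_read_naive(struct model_pipe *p)
{
	if (p->head == p->tail)
		return 0;	/* ring empty */
	return p->len[p->tail++ & (MODEL_RING_SIZE - 1)];
}

/* Reader with the fix: discard zero-length slots before reporting data. */
static size_t model_read_skip_empty(struct model_pipe *p)
{
	while (p->head != p->tail) {
		size_t n = p->len[p->tail++ & (MODEL_RING_SIZE - 1)];

		if (n)
			return n;
	}
	return 0;		/* ring is genuinely empty */
}
```

In the model, a faulting write followed by a successful one makes the naive
reader return 0 while data is still queued, whereas the skipping reader
returns the queued length. The real fix additionally goes back to waiting
for data when writers remain, which this model does not attempt to show.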

This is a modified version of Tetsuo's original patch.

Fixes: a194dfe6e6f6 ("pipe: Rearrange sequence in pipe_write() to preallocate slot")
Link: https://lore.kernel.org/linux-fsdevel/20201005121339.4063-1-penguin-kernel@xxxxxxxxxxxxxxxxxxx/
Reported-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Acked-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>

---
fs/splice.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

--- a/fs/splice.c
+++ b/fs/splice.c
@@ -526,6 +526,22 @@ static int splice_from_pipe_feed(struct
 	return 1;
 }

+/* We know we have a pipe buffer, but maybe it's empty? */
+static inline bool eat_empty_buffer(struct pipe_inode_info *pipe)
+{
+	unsigned int tail = pipe->tail;
+	unsigned int mask = pipe->ring_size - 1;
+	struct pipe_buffer *buf = &pipe->bufs[tail & mask];
+
+	if (unlikely(!buf->len)) {
+		pipe_buf_release(pipe, buf);
+		pipe->tail = tail+1;
+		return true;
+	}
+
+	return false;
+}
+
 /**
  * splice_from_pipe_next - wait for some data to splice from
  * @pipe: pipe to splice from
@@ -545,6 +561,7 @@ static int splice_from_pipe_next(struct
 	if (signal_pending(current))
 		return -ERESTARTSYS;

+repeat:
 	while (pipe_empty(pipe->head, pipe->tail)) {
 		if (!pipe->writers)
 			return 0;
@@ -566,6 +583,9 @@ static int splice_from_pipe_next(struct
 		pipe_wait_readable(pipe);
 	}

+	if (eat_empty_buffer(pipe))
+		goto repeat;
+
 	return 1;
 }