[PATCH 3.16 204/410] pipe: fix limit checking in alloc_pipe_info()

From: Ben Hutchings
Date: Thu Jun 07 2018 - 10:53:47 EST


3.16.57-rc1 review patch. If anyone has any objections, please let me know.

------------------

From: "Michael Kerrisk (man-pages)" <mtk.manpages@xxxxxxxxx>

commit a005ca0e6813e1d796a7422a7e31d8b8d6555df1 upstream.

The limit checking in alloc_pipe_info() (used by pipe(2) and when
opening a FIFO) has the following problems:

(1) When checking the capacity required for the new pipe, the checks
    against the limits in /proc/sys/fs/pipe-user-pages-{soft,hard} are
    made against existing consumption and exclude the memory required
    for the new pipe. As a consequence: (i) the memory allocation
    throttling provided by the soft limit does not kick in quite as
    early as it should, and (ii) the user can overrun the hard limit.

(2) As currently implemented, accounting and checking against the limits
    is done as follows:

    (a) Test whether the user has exceeded the limit.
    (b) Make new pipe buffer allocation.
    (c) Account new allocation against the limits.

    This is racy. Multiple processes may pass point (a) simultaneously,
    and then allocate pipe buffers that are accounted for only in step
    (c). The race means that the user's pipe buffer allocation could be
    pushed over the limit (by an arbitrary amount, depending on how
    unlucky we were in the race). [Thanks to Vegard Nossum for spotting
    this point, which I had missed.]
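
As an illustration of the old ordering, here is a minimal userspace sketch
(not kernel code). The counter, the 40-page limit and the function name are
hypothetical and exist only for this example. It shows that a check against
existing consumption alone lets the total end up above the limit:

```c
#include <stdbool.h>

static long old_user_pages;             /* hypothetical per-user page count */
static const long old_hard_limit = 40;  /* hypothetical hard limit, in pages */

/* Old ordering: check, allocate, then account. */
static bool old_order_alloc(long pages)
{
    /* (a) Check existing usage only; the new request is not included. */
    if (old_user_pages >= old_hard_limit)
        return false;

    /* (b) The buffer allocation would happen here. Other callers can
     * also pass (a) before anyone reaches (c); that is the race window. */

    /* (c) Account for the allocation afterwards. */
    old_user_pages += pages;
    return true;
}
```

With 16-page requests, three calls succeed and leave the counter at 48,
which is above the limit of 40, before a fourth call is refused.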

This patch addresses the above problems as follows:

* Alter the checks against limits to include the memory required for the
  new pipe.
* Re-order the accounting step so that it precedes the buffer allocation.
  If the accounting step determines that a limit has been reached, revert
  the accounting and cause the operation to fail.
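
The account-first ordering described in the two points above can be sketched
in userspace C roughly as follows. This is a simplified model, not the kernel
code: the atomic counter, the 40-page limit and both function names are
hypothetical, and the soft limit is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_long fixed_user_pages;     /* hypothetical per-user page count */
static const long fixed_hard_limit = 40; /* hypothetical hard limit, in pages */

/* Move this caller's charge from old_pages to new_pages and return the
 * updated total. atomic_fetch_add() returns the value before the add. */
static long fixed_account(long old_pages, long new_pages)
{
    long delta = new_pages - old_pages;

    return atomic_fetch_add(&fixed_user_pages, delta) + delta;
}

/* New ordering: account first, check the total, revert on failure. */
static bool fixed_order_alloc(long pages)
{
    /* The charge is visible to other callers before the check is made. */
    long total = fixed_account(0, pages);

    if (total >= fixed_hard_limit) {
        fixed_account(pages, 0);    /* undo the charge and fail */
        return false;
    }

    /* The buffer allocation would happen here; if it failed, the charge
     * would be reverted in the same way. */
    return true;
}
```

Because every request is charged before it is checked, concurrent callers
see each other's pending requests, so successful allocations cannot add up
to more than the limit. With 16-page requests, two calls succeed, the third
is refused, and the counter stays at 32.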

Link: http://lkml.kernel.org/r/8ff3e9f9-23f6-510c-644f-8e70cd1c0bd9@xxxxxxxxx
Signed-off-by: Michael Kerrisk <mtk.manpages@xxxxxxxxx>
Reviewed-by: Vegard Nossum <vegard.nossum@xxxxxxxxxx>
Cc: Willy Tarreau <w@xxxxxx>
Cc: <socketpair@xxxxxxxxx>
Cc: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxx>
Cc: Al Viro <viro@xxxxxxxxxxxxxxxxxx>
Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
Signed-off-by: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
[bwh: Backported to 3.16: Don't use GFP_KERNEL_ACCOUNT]
Signed-off-by: Ben Hutchings <ben@xxxxxxxxxxxxxxx>
---
fs/pipe.c | 20 +++++++++++++-------
1 file changed, 13 insertions(+), 7 deletions(-)

--- a/fs/pipe.c
+++ b/fs/pipe.c
@@ -618,24 +618,30 @@ struct pipe_inode_info *alloc_pipe_info(
 	if (pipe == NULL)
 		goto out_free_uid;
 
-	if (!too_many_pipe_buffers_hard(user)) {
-		if (too_many_pipe_buffers_soft(user))
-			pipe_bufs = 1;
-		pipe->bufs = kcalloc(pipe_bufs,
-				     sizeof(struct pipe_buffer),
-				     GFP_KERNEL);
+	account_pipe_buffers(user, 0, pipe_bufs);
+
+	if (too_many_pipe_buffers_soft(user)) {
+		account_pipe_buffers(user, pipe_bufs, 1);
+		pipe_bufs = 1;
 	}
 
+	if (too_many_pipe_buffers_hard(user))
+		goto out_revert_acct;
+
+	pipe->bufs = kcalloc(pipe_bufs, sizeof(struct pipe_buffer),
+			     GFP_KERNEL);
+
 	if (pipe->bufs) {
 		init_waitqueue_head(&pipe->wait);
 		pipe->r_counter = pipe->w_counter = 1;
 		pipe->buffers = pipe_bufs;
 		pipe->user = user;
-		account_pipe_buffers(user, 0, pipe_bufs);
 		mutex_init(&pipe->mutex);
 		return pipe;
 	}
 
+out_revert_acct:
+	account_pipe_buffers(user, pipe_bufs, 0);
 	kfree(pipe);
 out_free_uid:
 	free_uid(user);