[PATCH v2 3/3] exec: do unshare_files after de_thread

From: Jeff Layton
Date: Thu Aug 30 2018 - 13:24:31 EST

POSIX mandates that open fds and their associated file locks should be
preserved across an execve. This works, unless the process is
multithreaded at the time that execve is called.

In that case, we'll end up unsharing the files_struct but the locks will
still have their fl_owner set to the address of the old one. Eventually,
when the other threads die and the last reference to the old
files_struct is put, any POSIX locks get torn down since it looks like
a close occurred on them.

The result is that all of your open files will be intact with none of
the locks you held before execve. The simple answer to this is "use OFD
locks", but this is a nasty surprise and it violates the spec.

Fix this by doing unshare_files later during exec, after we've already
killed off the other threads in the process. This helps ensure that we
only unshare the files_struct during exec when it is truly shared with
other processes.

Note that because the unshare_files call is now done just after
de_thread, we need a mechanism to pass the displaced files_struct back
up to __do_execve_file. This is done via a new displaced_files field
inside the linux_binprm.

Cc: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
Reported-by: Daniel P. Berrangà <berrange@xxxxxxxxxx>
Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
fs/exec.c | 11 ++++++-----
include/linux/binfmts.h | 1 +
2 files changed, 7 insertions(+), 5 deletions(-)

diff --git a/fs/exec.c b/fs/exec.c
index ca25f805ebad..e32bd1a33660 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1262,6 +1262,10 @@ int flush_old_exec(struct linux_binprm * bprm)
if (retval)
goto out;

+ retval = unshare_files(&bprm->displaced_files);
+ if (retval)
+ goto out;
* Must be called _before_ exec_mmap() as bprm->mm is
* not visibile until then. This also enables the update
@@ -1713,7 +1717,7 @@ static int __do_execve_file(int fd, struct filename *filename,
char *pathbuf = NULL;
struct linux_binprm *bprm;
- struct files_struct *displaced;
+ struct files_struct *displaced = NULL;
int retval;

if (IS_ERR(filename))
@@ -1735,10 +1739,6 @@ static int __do_execve_file(int fd, struct filename *filename,
* further execve() calls fail. */
current->flags &= ~PF_NPROC_EXCEEDED;

- retval = unshare_files(&displaced);
- if (retval)
- goto out_ret;
retval = -ENOMEM;
bprm = kzalloc(sizeof(*bprm), GFP_KERNEL);
if (!bprm)
@@ -1817,6 +1817,7 @@ static int __do_execve_file(int fd, struct filename *filename,
would_dump(bprm, bprm->file);

retval = exec_binprm(bprm);
+ displaced = bprm->displaced_files;
if (retval < 0)
goto out;

diff --git a/include/linux/binfmts.h b/include/linux/binfmts.h
index c05f24fac4f6..d7ec384bb1b0 100644
--- a/include/linux/binfmts.h
+++ b/include/linux/binfmts.h
@@ -49,6 +49,7 @@ struct linux_binprm {
unsigned int taso:1;
unsigned int recursion_depth; /* only for search_binary_handler() */
+ struct files_struct * displaced_files;
struct file * file;
struct cred *cred; /* new credentials */
int unsafe; /* how unsafe this exec is (mask of LSM_UNSAFE_*) */